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Foreword 


SNOBOL4 is a computer programming language containing many features not 
commonly found in other programming languages. It evolved from SNOBOL [1,2,3]*, a 
language for string manipulation, developed at Bell Telephone Laboratories, Incorporated, 
in 1962. Extensions to SNOBOL through various versions have made it a useful tool in 
such areas as compilation techniques, machine simulation, symbolic mathematics, text 
preparation, natural language translation, linguistics, and music analysis. 

The basic data element of SNOBOL4 is a string of characters, such as this line of 
printing. The language has operations for joining and separating strings, for testing their 
contents, and for making replacements in them. If a string is a sentence, it can be broken 
into phrases or words. If it is a formula, it can be taken apart into components and 
reassembled in another format. A string can appear either as a literal or as the value of a 
variable. The literal form is indicated by enclosing the string in quotation marks: 

'THIS IS A STRING' 

The string value may be assigned to a variable: 

LINE = 'THIS IS A STRING' 

A common operation on a string is examination of its contents for a desired structure of 
characters. This structure, known as a pattern, can be as simple as a string or a given 
number of characters. A pattern also can be an extremely complicated expression 
consisting, for example, of a number of alternatives followed by another set of alternatives, 
all of which must begin a given number of characters from the end of the string. The 
pattern, as a data type, may also appear either in literal or variable form. The data type 
of a variable - string, pattern, or any other in the language - depends on the last value 
assigned to it. There are no type declaration statements for variables as in other 
programming languages. 

SNOBOL4 provides numerical capabilities with both integers and real numbers. Because 
the language is essentially character oriented, and since most numerical operations involve 
character counting, integers are more commonly used. Conversion among integers, real 
numbers, and strings representing integers or real numbers is performed automatically as 
required. The programmer may, in addition, define other data types, such as complex 
numbers, and provide operations for them. 

Often it is desirable to associate a group of items with one variable name through 
numerical indexing or some other identifying property. The SNOBOL4 array and table 
provide these capabilities with more flexibility than most programming languages. An 

* Numbers in brackets refer to references listed at the end of this manual. 
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array is a data element consisting of a set of pointers to other data elements, so that each 
array element may be any data type, even an array. An element of an array is referenced 
by using an integer index. A table is similar to an array, except that the reference value 
need not be an integer, but can be any of several other types. Conversion can be made 
between tables and arrays. 

Execution of SNOBOL4 programs is interpretive. Instead of compiling a program into 
actual computer instructions, the compiler translates the program into a notation the 
interpreter can easily execute. This makes it fairly simple to provide capabilities such as 
tracing of new values for variables, an operation that is quite difficult in noninterpretive 
systems. Another important product of interpretation is flexibility. Functions can be 
defined and redefined during program execution. Function calls can be made recursively 
with no special program notation. The language is extendable to new data types needed 
for a program through data type definition operations. Linked-list nodes and complex 
numbers are possible programmer-defined data types. Operations on these new data types 
can be defined as functions. 

This book is an instructional and reference guide, and provides many examples of usage 
of the language. The description of the language is complete and does not require 
familiarity with earlier versions of the language. Some familiarity with elementary concepts 
of programming is presumed, however. 

M. D. Shapiro 

Lafayette, Indiana 
May, 1970 




Preface 


The SNOBOL4 programming language has been developed over a period of years and 
new language features have been added from time to time during the course of this 
development. Consequently there are several somewhat different versions of the language in 
use. The first edition of this book, published in May, 1969, described Version 2. The 
description in this second edition corresponds to Version 3, released in December, 1969. 
Version 3 contains a number of features not available in Version 2. 

SNOBOL4 has been implemented on several different computers, including the IBM 
System/360, UNIVAC 1108, GE 635, CDC 3600, CDC 6000 series, PDP-10, Sigma 5/6/7, 
Atlas 2, and RCA Spectra 70 series. Implementations for other machines are in various 
stages of completion. These machines have different operating environments and character 
sets. As a result, implementations of SNOBOL4 vary from machine to machine in details 
of syntax, operating system interface, and so forth. This book corresponds to the 
implementation of SNOBOL4 developed at Bfell Telephone Laboratories, Incorporated on 
the IBM System/360 operating under OS. Sections of the manual containing language 
features particularly dependent upon this implementation make specific reference to this 
dependency. Program examples in this book were run on an IBM 360 Model 65. 
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CHAPTER 1 


Introduction to the SNOBOL4 Programming Language 


This chapter is an introductory overview of the SNOBOL4 programming language. It 
describes the format of statements, some of the operations, and some of the types of data 
handled by the language. Later chapters describe in more detail much of the material in 
this introductory chapter. 

A SNOBOL4 program consists of a sequence of statements. There are four basic types 
of statements; 

(1) the assignment statement, 

(2) the pattern matching statement, 

(3) the replacement statement, and 

(4) the end statement. 

The end statement terminates the program. 


1.1 Assignment Statements 

The simplest type of statement is the assignment statement. It has the form 
variable = value 

The assignment statement may be said to have the following meaning: ‘Let variable have 
the given value. ’ For example, let V have the value 5 , or 

V = 5 

The value may be given by an expression, consisting, for example, of arithmetic operations 
as in the statement 

W = 14 +(16 - 10) 

which assigns the value 20 to the variable W. Blanks are required around arithmetic 
operators such as + and -. The value need not be an integer, which is just one type of 
data handled by SNOBOL4. For example, the value may be a string of characters, 
indicated by enclosing quotes. An example is the assignment statement 

V = 'DOG' 
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which assigns the string DOG to the variable V . Various types of data and operations that 
may be performed on them are described later. 

Typically a variable is a name such as V, X, or ANS . Variables appearing explicitly in a 
program must begin with a letter which may be followed by any number of letters, digits, 
periods, and underscores. 

The value of a variable may be used in an assignment statement. Thus 
RESULT = ANS. 1 

assigns to the variable RESULT the Value of ANS. 1. (Quotation marks distinguish literal 
strings from variables.) 

Blanks are required to separate the parts of a statement. In an assignment statement, the 
equal sign must be separated from the variable on the left and the value on the right by 
at least one blank. 


1.2 Arithmetic 
1.2.1 Integers 

The arithmetic operations of addition, subtraction, multiplication, division, and 
exponentiation of integers may be used in expressions. The statements 

M = 4 

N = 5 

P = N*M/(N-1) 

assign the value 5 to P. While blanks are required between the binary operators and their 
operands, unary operators such as the minus sign must be adjacent to their operands. An 
example is the statement 

Q2 = -P / -N 

which assigns the value 1 to Q2. 

Arithmetic expressions can be arbitrarily complex. When evaluating arithmetic 
expressions, the natural order of operator precedence applies. The unary operations are 
performed first, then exponentiation (♦♦ or ! ), then multiplication, followed by division, 
and finally addition and subtraction. All arithmetic operations associate to the left except 
exponentiation. Hence, 

X = 2 ♦♦ 3 ♦♦ 2 

is equivalent to 


X 


2 ♦♦ (3 ♦♦ 2) 
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Parentheses may be used to emphasize or alter the order of evaluation of an expression. 

In the above examples all the operands are integers and the results are integers. The 
quotient of two integers is also an integer. The remainder is discarded. Thus 

Q1 = 5/2 

Q2 = 5 / -2 

give Q1 and Q2 the values 2 and -2 , respectively. 


1.2.2 Real Numbers 

Real operands are also permitted in arithmetic expressions. The statements 

PI = 3.14159 

CIRCUM = 2. * PI * 5. 

assign real values to PI and CIRCUM. 

If real numbers are mixed with integers in arithmetic expressions, the result is a real 
number. For example, the value of 

SUM = 16.4+2 

is 1 8.4 . 


1.3 Strings 

Expressions involving operands that are character strings are also permitted in 
assignment statements. For example, the assignment statement 

SCREAM = 'HELP' 
assigns the string HELP as the value of SCREAM. 

The string is specified by enclosing it within a pair of quotation marks. Any character 
may appear in a string. A pair of double quotation marks can be used instead of single 
quotation marks. This permits the use of quotation marks within a string as in the 
statements 

PLEA = 'HE SHOUTED, "HELP."' 

QUOTE = ' " ' 

APOSTROPHE = " ' " 

Single quotation marks are used in the examples given in this book where one type of 
quotation mark is sufficient. 



4 Introduction to the SNOBOL4 Programming Language 


1.3.3 


1.3.1 The Null String 

The null string, which is a string of length zero, is frequently used in SNOBOL4. With 
a few exceptions, explained later, all variables have the null string as their initial value. A 
variable can also be assigned the null string by a statement like 

NULL = 

or, more briefly, 

NULL = 

The variable NULL is used in many examples that follow to represent the null string. The 
null string is diflFerent from the following strings, each of which has length one: 

'O' 


1.3.2 Strings in Arithmetic Expressions 

Numeral strings can be used in arithmetic expressions with integers and real numbers. 
For example, as a result of the statements 

Z = '10' 

X = 5*-Z+'10.6' 

X has the value -39.4. Numeral strings representing integers can contain only digits and 
an optional preceding sign. Numeral strings representing real numbers must have at least 
one digit before the decimal point. Thus, the following strings cannot be used in 
arithmetic expressions: 

'1,253,465' 

'.364 E-03' 

The null string is equivalent to the integer zero in arithmetic expressions. 


1.3.3 String-Valued Expressions 

Concatenation is the basic operation for combining two strings to form a third. The 
following statements illustrate the format of an expression involving concatenation. 

TYPE = ' SEMI' 

OBJECT = TYPE 'GROUP' 


The resulting value of OBJECT is the string SEMIGROUP. Notice there is no explicit 
operator for concatenation. Concatenation is indicated by specifying two string-valued 
operands separated by at least one blank. 
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FIRST = 'WINTER' 

SECOND = 'SPRING' 

TWO.SEASONS = FIRST ',' SECOND 


are equivalent to 

TWO.SEASONS = 'WINTER,SPRING' 


Strings can also be concatenated with reals and integers as in 

ROW = ' K' 

NO. = 22 

SEAT = ROW NO. 

which gives SEAT the value K22 . 

In an expression involving concatenation and arithmetic operations, concatenation has 
the lowest precedence. Thus 

SEAT = ROW NO. + 4 / 2 

is equivalent to 

SEAT = ROW (NO. + (4 / 2)) 

or 

SEAT = 'K24' 


1.3.4 Input and Output of Strings 

Three variables provide means for reading and writing data. The variables OUTPUT and 
PUNCH are for printing and punching. Whenever either of them is assigned a string, 
integer or real value, a copy of the value is put out. 

OUTPUT = 'THE RESULTS ARE;' 

assigns THE RESULTS ARE ; to OUTPUT and also prints it. 

PUNCH = OUTPUT 

causes the same line to be punched on a card. The statements 

OUTPUT = 

PUNCH 

cause a blank line to be printed and a blank card to be punched. 

The variable INPUT is used for reading in strings. Each time the value of INPUT is 
required in a statement, another card is read in and the 80-character string on it is 
assigned as the value of INPUT. Thus 


PUNCH 


INPUT 
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punches a copy of the input card. 

Data cards to be read in occur immediately after the end statement that terminates the 
program. 


1.4 Pattern Matching Statements 

The operation of examining strings for the occurrence of specified substrings (i.e. pattern 
matching) is fundamental to the SNOBOL4 language. Pattern matching can be specified 
in two types of statements: 

(1) the pattern matching statment, and 

(2) the replacement statement. 

The pattern matching statement has the form 
subject pattern 

where the two fields are separated by at least one blank. The subject specifies a string that 
is to be examined, and the pattern can be thought of as specifying a set of strings. The 
statement causes the subject string to be scanned from the left for the occurrence of a 
string specified by the pattern. 

If 

TRADE = 'PROGRAMMER' 

the statement 

TRADE 'GRAM' 

examines the value of TRADE for an occurrence of GRAM . If 
PART = ' GRAM' 

then an equivalent statement is 
TRADE PART 

The following example illustrates a pattern matching statement in which the pattern is a 
string-valued expression. 

ROW = 'K' 

NO. = 20 

'K24' ROW NO. + 4 

The subject is a literal and the value of the expression is the string K24. 

Notice that there is no explicit pattern matching operator between the subject and the 
pattern. The two fields are separated by blanks. 
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If it is necessary to have concatenation in the subject, the expression must be enclosed 
within parentheses to avoid ambiguity. An example is 

TENS = 2 

UNITS = 5 

(TENS UNITS) 30 


On the other hand, a pattern formed by concatenation does not need parentheses. The 
following statements are equivalent: 

TENS UNITS 30 
TENS (UNITS 30) 


1.5 Replacement Statements 

A replacement statement has the form 
subject pattern = object 

where the fields are separated by at least one blank. Pattern matching is performed as in 
the pattern matching statement. If the pattern matching operation succeeds, the subject 
string is modified by replacing the matched substring by the object. For example, if 

WORD = 'GIRD' 

then the replacement statement 

WORD '!' = 'OU' 

causes the subject string GIRD to be scanned for the string I and then, since the pattern 
matches, I is replaced by OU . Hence WORD has as value the string GOURD . If the 
statement is 

WORD 'AB' = 'OU' 

the value of WORD does not change because the pattern fails to match. 

Another example of the use of replacement statements is given in the following sequence 
of statements 


HAND 

= 'AC4DAHKDKS' 

RANK 

= 4 

SUIT 

= 'D' 

HAND 

RANK SUIT = ' 


which replaces the substring 4D with the string AS. 

A matched substring is deleted from the subject string if the object in the replacement 
statement is the null string. Thus 


HAND RANK SUIT 
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deletes 4D from HAND leaving it with the string ACAHKDKS as value. 


1.6 Patterns 

The patterns in the preceding examples specify single strings. It is also possible to 
specify more complex patterns. There are two operations available for constructing such 
patterns: 

(1) alternation, and 

(2) concatenation. 

Alternation is indicated by an expression of the form 
P1 I P2 

where the two patterns PI and P2 are separated from the | by blanks. The value of the 
expression is a pattern structure that matches any string specified by either Pi or P2 . For 
example, the statement 

KEYWORD = 'COMPUTER' | 'PROGRAM' 

assigns to KEYWORD a pattern structure that matches either of these two strings. 
Subsequently, KEYWORD may be used wherever patterns are permitted. For example, 

KEYWORD = KEYWORD | 'ALGORITHM' 

gives KEYWORD a new pattern value equivalent to the value assigned by executing the 
statement 

KEYWORD = 'COMPUTER' | 'PROGRAM' | 'ALGORITHM' 

Using KEYWORD in the pattern field, the statement 
TEXT KEYWORD = 

examines the value of TEXT from the left and deletes the first occurrence of one of the 
alternative strings. If 

TEXT = 'PROGRAMMING ALGORITHMS FOR COMPUTERS' 
the result of the replacement statement is as if the following statement were executed: 

TEXT = 'MING ALGORITHMS FOR COMPUTERS' 

Concatenation of two patterns. Pi and P2, is specified in the same way as the 
concatenation of two strings: 

PI P2 

That is, the two patterns are separated by blanks. The value of the expression is a pattern 
that matches a string consisting of two substrings, the first matched by Pi , the second 
matched by P2 . For example, if 
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BASE = 'BINARY' | 'DECIMAL' | 'HEX' 

SCALE = 'FIXED' j 'FLOAT' 

ATTRIBUTE = SCALE BASE 

and 

DCL = 'AREAFIXEDDECIMAL' 
then the pattern match succeeds in the statement 
DCL ATTRIBUTE 

Concatenation has higher precedence than alternation. Thus 

ATTRIBUTE = 'FIXED' | 'FLOAT' 'DECIMAL' 

matches FIXED or FLOATDECIMAL . The order of evaluation may be altered by using 
parentheses. 

ATTRIBUTE = ('FIXED' [ 'FLOAT') 'DECIMAL' 
matches either FIXEDDECIMAL or FLOATDECIMAL . 


1.7 Conditional Value Assignment 

It is possible to associate a variable with a component of a pattern such that if the 
pattern matches, the variable is assigned the substring matched by the component. The 
operator . is the conditional value-assignment operator and it is used in an expression of 
the form 

pattern , variable 

where the operator is separated from its operands by blanks. For example 
BASE = ('HEX' I 'DEC') . B1 

assigns to BASE a pattern that matches either HEX or DEC. If BASE is used successfully in 
a pattern match, the value of B1 is set to the substring matched by BASE. 

The operator . associates to the left, and has higher precedence than concatenation and 
alternation. 

A.OR.B = A I B . OUTPUT 

is equivalent to 

A.OR.B = A I (B . OUTPUT) 

which assigns to A. OR. B a pattern that matches the value of A or B . If B matches, the 
substring matched is printed. 
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There is also an operator $ for immediate value assignment which assigns value to a 
variable if the associated component of the pattern matches regardless of whether the 
entire pattern matches. Immediate value assignment is discussed in more detail later. 


1.8 Flow of Control 

A SNOBOL4 program is a sequence of statements terminated by an end statement. 
Statements are executed sequentially unless otherwise specified in the program. Labels and 
gotos are provided to control the flow of the program. 

A statement may begin with an identifying label, permitting transfer to the statement. 
For example, the assignment statement 

START TEXT = INPUT 

has the label START . A label consists of a letter or a digit followed by any number of 
other characters up to a blank. Blanks separate the label from the subject. A statement 
with no label must begin with at least one blank. The end statement is distinguished by 
the label END , indicating the end of the program. 

Transfer to a labelled statement is specified in the goto field which may appear at the 
end of a statement and is separated from the rest of the statement by a colon. Two types 
of transfers can be specified in the goto field: conditional and unconditional. 

A conditional transfer consists of a label enclosed within parentheses preceded by an F 
or S corresponding to failure or success. An example is the statement 

TEXT = INPUT :F(DONE) 

This statement causes a record to be read in and assigned as the value of TEXT. If, 
however, there is no data in the input file, i.e. an end of file is encountered, no new value 
is assigned to TEXT . Then, because of the failure to read, transfer is made to the 
statement labelled DONE. 

A use of the success goto is illustrated in the following program which punches a copy 
of the input file. 

LOOP PUNCH = INPUT :S(LOOP) 

END 

The first statement is repeatedly executed until the end of file is encountered. Then the 
program flows into the end statement causing the program to terminate. 

The success or failure of a pattern match can also be used to control the flow of a 
program by conditional gotos. For example 

COLOR = 'RED' I 'GREEN' | 'BLUE' 

BRIGHT TEXT COLOR = :S(BRIGHT)F(BLAND) 

BLAND 




1.9 


Indirect Reference 11 


All occurrences of the strings RED, GREEN, and BLUE are deleted from the value of 
TEXT before the pattern fails to match. Control then passes to the statement labelled 
BLAND. Both success and failure gotos can be specified in one goto field, and may appear 
in either order. 

An unconditional transfer is indicated by the absence of an F or S before the enclosing 
parentheses. For an example of an unconditional transfer, consider the following program 
that punches and lists a deck of cards. 

LOOP PUNCH = INPUT :F(END) 

OUTPUT = PUNCH :(LOOP) 

END 

The goto field in the second statement specifies an unconditional transfer. 


1.9 Indirect Reference 

Indirect referencing is indicated by the unary operator $. For example, if 
MONTH = 'APRIL' 

then $MONTH is equivalent to APRIL. That is, the statement 
$MONTH = 'CRUEL' 

is equivalent to 

APRIL = 'CRUEL' 

The indirect reference operator can also be applied to a parenthesized expression as in 
the statements 

WORD = 'RUN' 

$(WORD ':') = $(WORD ':') + 1 

which increment the value of RUN: . 

In general, the unary operator $ generates a variable that is the value of its operand. 
The expression 

$('A' I 'B') 

is erroneous because the value of the operand of $ is a pattern, not a string. Indirect 
reference in a goto is demonstrated by 

N = N + 1 :($('PHASE' N)) 

If, for example, the assignment statement sets N equal to 5', then the transfer is to the 
statement labelled PHASES. 
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1.10 Functions 

Many SNOBOL4 procedures are invoked by functions built into the system, called 
primitive functions. Operations that occur frequently are implemented as primitive 
functions for efficiency. Other primitive functions are used to invoke more complex 
operations that are fundamental to the language, affect parameters and tables internal to 
the system, and perform operations that could not be programmed in source language by 
other means. In addition, facilities are available for a programmer to define his own 
source-language functions. 


1.10.1 Primitive Functions 

The primitive function SIZE has a single string argument and returns as value an 
integer that is the length (number of characters) of the string. The statements 

APE = 'SIMIAN' 

OUTPUT = SIZE(APE) 

print the number 6 . 

Arguments to all functions are passed by value, and an arbitrarily complex expression 
may be used in the argument. Thus the statements 

N = 100 

OUTPUT = SIZE('PART' N + 4) 

print the number 7, because the value of the argument is the string PART 104. 

The argument of SIZE is supposed to be a string. Therefore, a call of the form 
SIZEl'APE' I 'MONKEY') 

is erroneous because the value of the argument is a pattern. 

DUPL is another function that performs an operation that is frequently required. 
DUPL (string, integer) returns as value a string that consists of a number of 
duplications of the string argument. The value of 

DUPL('/*',5) 

is /*/*/*/*/*. DUPL returns the null string if the second argument is zero, and fails 
if it is negative. The statement 

OUTPUT = DUPL(' ',40 - SIZE(S)) S 

prints the string S right justified to column 40 if its length is not greater than 40. 
Otherwise the statement fails, and S is not printed. 

REPLACE is a function called with three string-valued arguments. 

REPLACE(TEXT,CH1,CH2) 
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returns as value a string which is the same as TEXT, except that each occurrence of a 
character appearing in CH1 is replaced by the corresponding character in CH2 . For 
example, the statements 

STATEMENT = ’A(I,J) = A(l,j) + 3' 

OUTPUT = REPLACE(STATEMENT 

print the line 

A<I,J> = A<I,J> + 3 

If the last two arguments of the function call do not have the same length, the function 
fails. Function failure, like input failure, can be used in a conditional transfer. 

There are also several functions that return patterns as their values. LEN is such a 
function. LEN( integer) returns a pattern that matches any string of the length specified 
by the integer. 

The following example punches a card with the first 40 characters from a card that is 
read in. 


INPUT LEN(40) . PUNCH 


1.10.2 Predicates 

A predicate is a function or operation that returns the null string as value if a given 
condition is satisfied. Otherwise it fails. 

LE is an example of a predicate used for comparing numbers. 

LE(N1,N2) 

returns the null string as value if N1 is a number less than or equal to N2. N1 and N2 
may be either integer or real. Thus 

PUNCH = LE(SIZE(TEXT),80) TEXT 

punches the string TEXT if its length is not greater than 80. The null string value of the 
predicate does not affect the string that is punched. If the predicate fails, no assignment is 
made to PUNCH , and no card is punched. 

The success or failure of a predicate can be used with a conditional goto to control the 
flow of a program. For example, 

SUM = 0 

N = 0 

ADD N = LT(N,50) N + 1 :F(D0NE) 

SUM = SUM + N :(ADD) 

DONE OUTPUT = SUM 


sums the first 50 integers. Iteration continues as long as N is less than 50. When the 
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predicate fails, the conditional transfer to DONE is performed and the string 127 5 is 
printed. 

There are several predicates for comparing data objects. For example, 

DIFFER(ST1,ST2) 

returns the null string as value if the values of two arguments are not identical. Thus 

OUTPUT = DIFFER(FIRST,SECOND) FIRST SECOND 

concatenates the values of FIRST and SECOND if they are not the same, and then prints 
them. The predicate IDENT is the converse of DIFFER. IDENT fails if the values of its 
arguments are not identical. 

For all functions, an omitted argument is assumed to be the null string. Thus 
PUNCH = DIFFER(TEXT) TEXT 
punches the value of TEXT if it is not the null string. 

LGT is a predicate that lexically compares two strings. 

LGTlSTI,ST2) 

succeeds if ST1 follows (is lexically greater than) ST2 in alphabetical order. The 
statements 

OUTPUT 
OUTPUT 
OUTPUT 
SKIP OUTPUT 
JUMP 

print the values of TEXT 1 and TEXT2 in alphabetical order. 






LGT(TEXT 1,TEXT2) TEXT2 ;S(SKIP) 

TEXT1 

TEXT2 :(JUMP) 

TEXT1 


1.10.3 Defined Functions 

The SNOBOL4 language provides the programmer with the capability to define 
functions in the source language. This feature facilitates the organization of a program 
and may improve its efficiency. 

A programmer may define a function by executing the primitive function DEFINE to 
specify the function name, formal arguments, local variables, and the entry point of the 
function. The entry point is the label of the first of a set of SNOBOL4 statements 
constituting the procedure for the function. 

The first argument of DEFINE is a prototype describing the form of the function call. 
The second argument is the entry point. For example, execution of the statement 

define!'delete!string,char)','D1') 
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defines a function DELETE having two formal arguments, STRING and CHAR, and entry 
point D1 . The statements 

D1 STRING CHAR = ;S(D1) 

DELETE = STRING ;(RETURN) 

form a procedure that deletes all occurrences of CHAR from the value of STRING. The 
statement assigning the resulting value to the variable DELETE illustrates the SNOBOL4 
convention for returning a function value. The function name may be used as a variable 
in the function procedure. Its value on return from the procedure is the value of the 
function call. Return from a procedure is accomplished by transfer to the system label 
RETURN. 

If the second argument is omitted from the call of DEFINE, the entry point to the 
procedure is taken to be the same as the function name. For example 

define!'delete(string,char)') 

could have the procedure 

DELETE STRING CHAR = ;S(DELETE) 

DELETE = STRING ;(RETURN) 

A call of the function is illustrated in the following statements 

MAGIC = 'ABRACADABRA' 

OUTPUT = DELETE(MAGIC,'A') 

which print BRCDBR. 

Arguments are passed by value and may be arbitrarily complex expressions. Thus the 
statement 

TEXT = DELETE(delete!INPUT, '.'), ' ') 

deletes all periods and blanks from the input string. 

Functions can also fail under specified conditions. As an example, consider the following 
version of DELETE, which fails if STRING does not contain an occurrence of CHAR. 

DELETE STRING CHAR = ;F(FRETURN) 

D2 STRING CHAR = :S(D2) 

DELETE = STRING ;(RETURN) 

The transfer to the system label FRETURN indicates failure of the function call. 
Consequently, 

PUNCH = DELETE(INPUT,’*') 

punches a card only if the input string contains an *. 

Arguments to a function and the value returned can be any type of data object. 
Consider, for example, the function MAXNO where MAXNO ( P , N) returns a pattern that 
matches up to N adjacent strings matched by the pattern P . That is, if 
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PAT = MAXNO ('A' | 'B' | 'C , 2 ) 

then in the statement 

'EBCDIC' PAT 'D' 

the pattern match succeeds with PAT matching the string BC. 

MAXNO has the defining statement 
define!'MAXN 0(P,N)') 
and the procedure 

MAXNO N = GT(N,0) N - 1 ;F(RETURN) 

MAXNO = NULL | P MAXNO :(MAXNO) 

Consider the function REVERSE that reverses a string. It has the defining statement 

define!'reverse(string)','R1') 

and the procedure 

R1 ONECH = LENd) . CH 

R2 STRING ONECH = ;F!RETURN) 

REVERSE = CH REVERSE :!R2) 

There are two variables, ONECH and CH , used in the function definition in addition to the 
function name and formal argument. It is prudent to protect these variables so their use 
outside the function is not affected when the function is called. This is accomplished by 
declaring them to be local variables in the defining statement: 

define!'reverse(string)ONECH,CH','R l') 

When the function is called, the current values of the function name, the formal 
arguments, and the local variables are saved before the procedure is entered. These values 
are restored upon return from the procedure. This permits the programmer considerable 
freedom in defining functions. For example, a function can be recursive, i.e. include a call 
of the function itself. Consider the binomial coefficient c!n,ni) which can be defined by 
equations 

c!n,0) = 1 

c!n,m) = n*c!n-1 ,ni-1 )/in for m > 0 

Computational efficiency can be improved by employing the relation 
c!n,ni) = c!n,n-ni) 

for m > n/2 . 

The corresponding programmer-defined function consists of the defining statement 
define!'C!N,M)') 
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and the procedure 

C M = LT(N - M,M) N - M 

C = EQ(M,0) 1 ;S(RETURN) 

C = N ♦ C(N - 1,M - 1) / M ;(RETURN) 

COMB is an example of another recursively defined function. COMB(STR,N) lists all 
combinations of N characters from the string STR. The defining statement and procedure 
are 

DEFINE('COMB(STR,N,HEAD)CH') 

and 

COMB OUTPUT = EQ(N,0) HEAD :S(RETURN) 

C2 STR LE(N,SIZE(STR)) LEN(1) . CH = ;F(RETURN) 

COMB(STR,N - I.HEAD CH) :(C2) 

Then 

COMB('ABCD',3) 

prints 

ABC 

ABD 

ACD 

BCD 


Notice that COMB is defined with three formal arguments but only two values are 
supplied in the initial call. The missing value is taken to be null. 


1.11 Keywords 

Several parameters and switches internal to the SNOBOL4 system can be accessed by 
means of keywords. Keywords are specified by prefixing an ampersand to certain 
identifiers. For example, if the value of the keyword £DUMP is a nonzero integer when a 
program terminates, a dump of natural variables is printed. Thus the execution of the 
statement 

6DUMP = 1 

indicates that a dump is to be produced. 

Strings read in by INPUT are 80 characters long. Such strings often contain many 
unwanted trailing blanks. 6TRIM is a keyword that controls the handling of trailing 
blanks on input of data. If the value of 6TRIM is nonzero, trailing blanks are deleted. 
Thus 


STRIM 


1 
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causes trimming of trailing blanks. If 6 TRIM is zero, trailing blanks are not trimmed. 
Other keywords are described elsewhere in this book. 


1.12 Arrays 

Arrays of variables can be created by using the primitive function ARRAY . The 
arguments of ARRAY describe the number of dimensions, the bounds of each dimension, 
and the initial value of each variable in the array. Thus 

V = ARRAY(10,1.0) 

creates and assigns to V a one-dimensional array of ten variables, each initialized to the 
real value 1.0. The created variables can be referenced by expressions of the form V<I> 
where I = 1 , . . . , 10 . The statement 

N = ARRAY('3,5') 

creates a 2-dimensional array of variables 

N<1,1> N<1,2> N<1,3> N<1,4> N<1,5> 

N<2,1> .... 

N<3,1> . . . N<3,5> 

The omission of the second argument causes each of the variables to have the null string 
as initial value. The arguments in the call of ARRAY can be expressions. Thus 

6TRIM = 1 

A = ARRAY(INPUT) 

create an array with dimensionality that is data dependent. An array reference, A<I> , that 
is outside the bounds of the array causes failure that can be used to control program flow. 
The statements 

6TRIM = 1 

I = 1 

ST = ARRAY(INPUT) 

MORE ST<I> = INPUT 

1 = 1 + 1 
GO 

generate an array ST , and assign values to each of the variables. When all the variables 
in the array are assigned values, or an end of file is encountered, the transfer to GO is 
executed. 


:F(GO) 

: (MORE) 
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1.13 Tables 

Sets of pairs of associated data objects can be created by the use of tables. A table is 
similar to a one-dimensional array. However, instead of referencing an element with an 
integer, any data object can be used. 

A table is created by the primitive function TABLE . For example 

T = table!) 

creates a table and assigns it as value of T. 

Elements in the table can be assigned values in a manner similar to array elements. 
Examples are 

T<'A'> = 5 

and 

T<WORD> = T<WORD> + 1 

Tables have varying lengths and are dynamically extended if necessary. Some efficiency 
can be achieved by specifying an estimate of the size of a table at the time it is created. 
TABLE (N) creates a table that initially has room for N entries. 


1.14 Programmer-Defined Data Types 

Integers, reals, strings, patterns, arrays, and tables are types of data objects that are 
built into the SNOBOL4 language. Facilities are provided in the language to permit a 
programmer to define additional types of data. This facilitates representation of structural 
relationships inherent in data. 

For example, a simple linear linked list is made up of nodes, each containing a value 
field and a link field. 



The primitive function DATA can be used to define the data type NODE and the two 
field functions, VALUE and LINK. 

data!'node!value,link)') 

The statement 

P = NODE!'S',) 

creates a node with value field S and the null string in the link field. The value of P is a 
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data object with two fields that can be referenced by means of the function calls 
VALUE(P) and LINK(P) . The insertion of a node with value T at the head of the list is 
accomplished by the statement 

P = NODE('T',P) 

The following statement deletes a node from the head of the list 
P = LINK(P) 


1.15 Program Format 

A statement that is longer than one line can be continued onto successive lines by 
starting the continuation lines with a period or plus sign. An example is 

OUTPUT = 'THE TOTAL NUMBER OF OCCURENCES IS ' 

+ SUM<N> 

Plus signs are used for continuation in examples in this book. 

When continuing a statement over a line boundary, the statement may be broken 
wherever a blank is required. 

Several statements may be placed on one line by using semicolons which indicate the 
ends of statements. An example is 

X = 2; Y = 3; Z = 10 

A line beginning with an asterisk is treated as a comment and does not affect the 
operation of the program. 


1.16 Program Example 


This is an example of a complete SNOBOL4 program illustrating the use of comment 
lines, continuation lines, and the end statement. The program reads in data cards that 
follow the end statement. 




* * 

* THIS PROGRAM COUNTS THE NUMBER OF TIMES EACH * 

* LETTER IS USED IN INPUT TEXT. * 

* « 


4eitt:|c4c4c4c4e4e4ci)t:|titt:|c4c4citi4c4e4i4i4c4c4t4ci(t)|i4c4c)|c)|i)|(4i4t)|c)|(4(4c4i4e4e4c4c4t)|(4(4c4ei(t4c4t4c4i4c4c4eitt%)|i4t4e4e4e:|c4(4(:|c4e4c4e4c4c4i 


£TRIM 

CHAR 

LETTERS 

COUNT 

READ OUTPUT 


1 

LEN(1) . CH 

' ABCDEFGHIJKLMNOPQRSTUVWXYZ' 
table!30) 

INPUT :F(DISPLAY) 
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TEXT 

= OUTPUT 


NEXT 

TEXT 

CHAR = 

:F(READ) 


COUNT<CH> 

= COUNT<CH> + 1 

:(NEXT) 

DISPLAY 

OUTPUT 

= 


LOOP 

LETTERS 

CHAR = 

;F(END) 


OUTPUT 

= NE(COUNT<CH>) CH 

' OCCURS ' COUNT<CH> 

+ 



:(LOOP) 

END 





The program output follows, indicating the text read in and the resulting letter count. 

"THE WORLD OF THE FUTURE WILL BE AN EVER MORE DEMANDING STRUGGLE AGAINST 
THE LIMITATIONS OF OUR INTELLIGENCE, 

N. WEINER 


A 

OCCURS 

5 

TIMES 

B 

OCCURS 

1 

TIMES 

C 

OCCURS 

1 

TIMES 

D 

OCCURS 

3 

TIMES 

E 

OCCURS 

15 TIMES 

F 

OCCURS 

3 

TIMES 

G 

OCCURS 

5 

TIMES 

H 

OCCURS 

3 

TIMES 

I 

OCCURS 

9 

TIMES 

L 

OCCURS 

7 

TIMES 

M 

OCCURS 

3 

TIMES 

N 

OCCURS 

9 

TIMES 

0 

OCCURS 

6 

TIMES 

R 

OCCURS 

7 

TIMES 

S 

OCCURS 

3 

TIMES 

T 

OCCURS 

9 

TIMES 

U 

OCCURS 

4 

TIMES 

V 

OCCURS 

1 

TIMES 

W 

OCCURS 

3 

TIMES 


1.17 Conclusion 

This chapter has presented only an introduction to the main features of SNOBOL4. 
Detailed descriptions of many features are postponed to subsequent chapters, which 
provide a guide for the programmer. 

Chapter 2, the longest and most involved, describes pattern matching. Primitive 
functions, predicates, and operators are considered in Chapter 3. Programmer-defined 
functions are covered in Chapter 4. Arrays, tables, and programmer-defined data types are 
described in Chapter 5. Chapter 6 discusses keywords and introduces names and code. 
Chapter 7 contains information on types of data and data type conversion. Tracing 
facilities are described in Chapter 8, and Chapter 9 covers input and output. The 
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execution of a SNOBOL4 program, including error messages and error control are the 
subject of Chapter 10. Finally, Chapter 11 discusses matters of efficient programming and 
related topics. 

In addition to numerous examples included in the text, exercises are provided at the 
ends of chapters. 

Appendices at the end of this book provide additional detailed material. Appendix A 
contains a complete description of the syntax of SNOBOL4. Appendix B describes the 
differences between Versions 2 and 3 of SNOBOL4. Appendix C illustrates the use of 
SNOBOL4 in more complicated programs. Appendix D contains solutions to the exercises 
given at the ends of the chapters. 


Exercises 

Problem 1.1: Write a program that reads in data cards, prints all cards read in, and 
punches out a copy of all cards that do not contain an ampersand. 

Problem 1.2: Write a program that deletes the first eight characters from each 80- 
character input card and prints the remainder of the card. 

Problem 1.3: Compute and print out the first ten powers of e and their sum. 

Problem 1.4: Write a program that encodes messages by some simple character 
replacement (for example, replace A by Z, B by Y, etc.). Include punctuation in the 
encoding. 

Problem 1.5: Write a program that reads a deck of cards into an array. Let the first card 
indicate the number of cards that follow. Print out the cards in reverse order. 

Problem 1.6: Hexadecimal digits 0,1,...,9,A,B,C,D,E,F have the decimal equivalents of 
0,1,..., 15. Using a table to associate hexadecimal digits with corresponding decimal 
equivalents, define a function that converts a string of hexadecimal characters to its 
decimal equivalent. 

Problem 1.7: Let a binary tree be a data structure composed of nodes that contain three 
fields: value, left son, right son. A tree has a distinguished node, the root, that is not the 
left son or right son of any node. 

a. Define a data type BNODE with fields appropriate for a binary tree. 

b. Generate a binary tree with three nodes such that the value of the root node is +, 
the value of the left son is X, and the value of the right son is Y. 

c. Define a function LLIST that prints the values of the nodes of a binary tree in 
the following order. 
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1. value of the root 

2. value of the nodes in the left subtree 

3. value of the nodes in the right subtree 




CHAPTER 2 


Pattern Matching 


2.1 Introduction 

Strings of characters can be synthesized from smaller strings by concatenation. The 
converse of synthesis, decomposition of strings into substrings, is performed using pattern 
matching. Fundamentally, pattern matching is the process of examining a subject string for 
a substring which is one of a set specified by a pattern. The substring and parts thereof, 
formed by pattern matching, can be assigned as the values of variables. 

There are two types of statements in which pattern matching can occur: the pattern 
matching statement and the replacement statement. These statements have the respective 
forms 

label subject pattern goto 

label subject pattern = object goto 

The pattern and object may be expressions, as illustrated by 

LAB1 TEXT A | B :S(LAB2)F(LAB3) 

LAB4 STR CD = X '3' :S(LABS)P(LABS) 


Before matching actually occurs, the expression in the pattern field is evaluated. Its 
value can be a string. It also can be a pattern structure, which may be thought of as a 
set of strings. The string or pattern structure is used to drive a pattern matching 
procedure (the scanner) which performs the actual matching. Should any string sf)ecified 
by the pattern field appear as a substring of the subject, pattern matching succeeds. 

The primary purpose of this chapter is to consider in detail those SNOBOL4 language 
features that programmers may use to write expressions that, when evaluated, yield pattern 
structures. These features include the pattern-building operations of concatenation and 
alternation, primitive pattern structures built into the system, primitive functions whose 
values are pattern structures, value assignment operations, and the unary operator ♦, 
which produces an unevaluated expression. 

Pattern structures representing sets of fixed strings such as those built by 

BASE = 'BINARY' | 'DECIMAL' | 'HEX' 

SCALE = 'FIXED' I 'FLOAT' 

ATTRIBUTE = SCALE BASE 


24 
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are basic to pattern matching. Additional language features provide natural ways to talk 
about more complicated sets of strings, such as: 

all strings of a specified length, 

all characters up to the first comma, 

the longest string of blanks, 

any number of repetitions of a string, 

any string balanced with respect to parentheses, and 

any string at all. 

For many users of SNOBOL4, a knowledge of how patterns are actually matched is of 
little importance. The success or failure of matching is all that matters. However, by 
understanding the scanning procedure, a programmer can write more efficient patterns and 
make use of features such as immediate value assignment and unevaluated expressions that 
can actually change a pattern during matching. Thus, the secondary purpose of this 
chapter is to describe how the scanner works. 


2.2 Alternation and Concatenation 

A brief introduction to the pattern building operations of alternation and concatenation 
appears in Chapter 1. There, alternation and concatenation are used to build pattern 
structures that match sets of strings. 

Alternation, indicated by the binary operator | , builds a single pattern structure from 
its two arguments. If P1 and P2 are strings or pattern structures, the statement 

P3 = PI I P2 

builds a new structure and assigns it as the value of P3. P3 matches any string matched 
by P 1 or P2 . 

No explicit operator is used to indicate concatenation. Concatenation is implied when 

two elements of an expression are separated by one or more blanks. If P4 and P5 are 

strings, the statement 

P6 = P4 P5 

assigns to P6 a string which is the value of P4 followed by the value of P5. If either P4 
or P5 is a pattern structure, the statement above builds a pattern structure and assigns it 
as the value of P6. P6 matches any string that can be formed from a string matched by 
P4 , followed by a string matched by P5. 

Alternation and concatenation can be used to build pattern structures that match large 
numbers of strings. Consider the following statements. 
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p = 

'BE' 1 

1 'BEA' 1 

1 'BEAR' 

Q = 

'RO' 

' ROO' 

'RODS' 

R = 

'DS' 1 

1 'D' 


S = 

'TS' 1 

1 ’ * 



PAT = P R I Q S 

Concatenation has higher precedence than alternation, so the structure for PAT is built as 
if 

PAT = (P R) I (Q S) 

had been written. PAT matches any of the twelve strings: 


BEDS 

ROTS 

BED 

ROT 

BEADS 

ROOTS 

BEAD 

ROOT 

BEARDS 

ROOSTS 

BEARD 

ROOST 


2.3 Scanning 

Matching a pattern structure against a subject string is done by a procedure called the 
scanner. The pattern structure behaves like a program that indicates to the scanner how 
to examine the subject string. 

At any instant during scanning, the scanner uses two pieces of information: 

(1) where in the subject string it should be looking, and 

(2) what component of the pattern structure it should match. 

The scanner has a pointer called the cursor which is positioned to the left of the character 
that the scanner must match. A second pointer called the needle points at the component 
to be matched. 

Consider the following example, in which the string of characters READS is matched by 
a pattern structure that is the value of BR. 

BR = ( ’ B ' I ' R ’ ) ( ' E ' I ’ EA' ) ( ’ D ' | ' DS ' ) 

'READS' BR 

For illustrative purposes, it is convenient to think of components of a pattern structure 
as a set of beads that the scanner is trying to thread using the needle. A bead diagram 
representing BR is shown below. 
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In bead diagrams, left-to-right order of concatenation is preserved. Alternation is 
represented top to bottom. The needle points at the bead that the scanner is currently 
trying to match. If a bead matches, the needle passes through and moves upward as far as 
it can go without crossing a horizontal line. If a bead does not match, the needle moves 
down to an alternative bead, provided one exists. Downward movement may not cross a 
horizontal line. If no alternative exists, the needle is pulled back through the last 
successfully matched bead, and an alternative is sought there. 

The following figure illustrates the steps in matching BR against READS . The arrow 
pointing at READS represents the cursor, while the arrow pointing at the beads represents 
the needle. Failure in the fifth step causes the needle to be pulled back. The cursor is 
moved back at the same time. 


READS 

t 






READS 

t 
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Bead diagrams graphically illustrate one important control that the programmer has 
over the scanner. In a pattern-valued expression such as 

BR = ( ' B ' I ' R ' ) ( ' E ' I ' EA' ) ( ' D ' | ' DS ' ) 
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alternatives are matched by the scanner in left-to-right order (top to bottom in the bead 
diagram). Thus, the scanner attempts to match B before R, E before EA, and D before DS. 
By positioning alternatives correctly a programmer can control the order in which the 
scanner looks at them. 

The bead diagram for the pattern structure PAT developed in the previous section 
follows. 








A successful match in the statement 
'ROOSTS' PAT 

requires eleven steps. 

2.4 Modes of Scanning 

Two keywords, SANCHOR and &FULLSCAN, give the programmer additional control 
over the scanner. The scanner operates in an unanchored or anchored mode, depending 
on the value of SANCHOR. When unanchored, a pattern can match anywhere in the 
subject string. When anchored, a pattern can match only beginning at the first character. 
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For efficiency, tests are made during scanning to prevent the scanner from looking at 
alternatives that cannot possibly succeed. SFULLSCAN can be used to turn these tests off, 
leading to complete, but possibly inefficient, pattern matching. Discussion of SFULLSCAN 
is deferred until the end of this chapter, since it is useful only with more sophisticated 
patterns. 


2.4.1 Unanchored Mode 

The keyword S ANCHOR initially has the value zero, signifying the unanchored mode of 
scanning. The scanner may look anywhere in the subject string for an appropriate 
substring. Consider the following example. 

'A BIG BOY' 'BIG' \ 'LITTLE' 

Pattern matching succeeds. The steps involved are shown below using a bead diagram. 


A BIG BOY 

t 




LITTL^ 


A BIG BOY 

t 



W'little' 


A B I G BOY 

t 



'LITTLE' 
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A B I BOY 

t 


The cursor is initially at the left of the subject string. When all possible alternatives fail, 
the cursor is moved one character to the right. All possible alternatives are tried with the 
cursor beginning in the new position. Again, all alternatives fail. The cursor is moved 
again and this time the first alternative succeeds. 

In the unanchored mode, the origin of pattern matching is moved by changing the 
initial position of the cursor. Thus, the scanner matches, if possible, a substring anywhere 
in the subject string. If more than one valid substring exists, the scanner finds the leftmost 
one. 



2.4.2 Anchored Mode 

Frequently it is necessary to know if a pattern matches when its origin is at the first 
character of the subject string. As an example, suppose a program is desired that reads 
any SNOBOL4 program and prints only those lines that are not comments (i.e. do not 
have ♦ in column 1). At first glance, the following statements might seem to suffice. 

BEGIN LINE = INPUT :F(END) 

LINE :S(BEGIN) 

OUTPUT = LINE :(BEGIN) 


END 
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This program does not work as intended because a card with ♦ appearing anywhere at all 
in it is rejected. 

If £ ANCHOR has a nonzero value obtained by executing a statement such as 
6ANCHOR = 1 

the pattern match is anchored at the left of the subject string. When the scanner is in the 
anchored mode, the initial position of the cursor is not moved. Thus the scanner, when 
anchored, only matches ♦ against the first character of LINE. 

The anchored mode of scanning is generally more efficient than the unanchored mode, 
since the scanner examines fewer possibilities. Anchored scanning should be used where 
possible. It is, of course, possible to switch modes during execution of a program simply 
by changing the value of £ ANCHOR. 


2.5 Value Assignment through Pattern Matching 

Pattern matching may be viewed as a means of decomposing a string into substrings. 
To be useful, a substring found by the scanner often must be assigned as the value of a 
variable. Consider the pattern BR used in an earlier section. 

BR = CB' I 'R') CE' I 'EA') CD’ | ’DS') 

Used in a pattern matching statement such as 

STR BR :S(L1)F(L2) 

where the subject string may be anything, success of matching indicates only that one of 
the valid strings appears somewhere in STR. It does not indicate which string matches or 
where it matches. On failure, no indication is given of how nearly successful the scanner 
was. There are two ways of assigning a substring found by the scanner to a variable: 
conditional value assignment and immediate value assignment. 


2.5.1 Conditional Value Assignment 

The binary operator . is used to indicate conditional value assignment. The value of 
the expression 

P . V 

is a pattern in which V is associated with the pattern P . This pattern is the same as P , 
except that upon successful completion of pattern matching, the substring matched by P is 
assigned as the value of the variable V. Thus, by associating several variables with 
portions of a pattern, it is possible to ascertain what the overall pattern matches, and also 
which components of the pattern are used in the match. For example, rewriting BR as 

BR = (('B' I 'R') CE’ I 'EA') CD' | ' DS' ) ) . BRVAL 
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associates the variable BRVAL with the entire pattern. On successful completion of 
matching, the entire substring matched is assigned as value of BRVAL . Rewriting still 
further, variables can be associated with pieces of the pattern. 

BR = (('B' I 'R') . FIRST ('E' | 'EA') . SECOND 

+ CD' 1 'DS') . THIRD) . BRVAL 

A successful match causes the entire substring to be assigned as the value of BRVAL. B or 
R becomes the value of FIRST, E or EA becomes the value of SECOND, and D or DS 
becomes the value of THIRD. Failure to match leaves the values of all variables 
unchanged. 


2.5.2 Immediate Value Assignment 

The binary operator $ signifies immediate value assignment. The expression 
P $ V 

associates a variable V with a pattern P so that whenever P matches a substring, the 
substring immediately becomes the new value of V . It is possible, by using $, to associate 
variables with parts of a large pattern, to see how far scanning progressed in the event of 
failure. Value assignment is done for those parts of the pattern that match, even if the 
overall match fails. Suppose BR is rewritten using $ instead of . where shown. 

BR = (CB* 1 'R') $ FIRST ('E' 1 'EA') $ SECOND 

+ ('D' I 'DS') $ THIRD) . BRVAL 

In the following statement, pattern matching fails. 

'BEATS' BR :S(L1)F(L2) 

However, since immediate assignment is performed whenever the associated part of the 
pattern matches, the following assignments are made. 

FIRST = 'B' 

SECOND = 'E' 

SECOND = 'EA' 

Values of THIRD and BRVAL are unchanged. If conditional assignment is used, values of 
all four variables are unchanged. In the following example, the pattern matches. 

'BREADS' BR :S(L1)F(L2) 

Values assigned both during and after scanning are: 

FIRST = 'B' 

FIRST = 'R' 

SECOND = ' E' 

SECOND = 'EA' 

THIRD = 'D' 

BRVAL = 'READ' 
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The outcome is the same as if conditional value assignment had been used. Immediate 
value assignment is less efficient in this case because two redundant assignments are made. 
As a general rule, conditional value assignment should be used whenever possible. 
Immediate value assignment should be used only in those cases where intermediate results 
are important. 


2.5.3 Precedence 

The operators . and $ have the same precedence and associate to the left. They have 
higher precedence than alternation and concatenation. Thus, in the statement 

BR = (('B' I 'R') $ FIRST ('E* | 'EA') $ SECOND 

+ ('D* I 'DS') $ THIRD) . BRVAL 

the outer parentheses are required to associate BRVAL with the entire pattern, while 
additional parentheses are not required to associate FIRST, SECOND, and THIRD. 

Z5.4 Association with the Variable OUTPUT 

Since OUTPUT is a variable, it may be associated with any portion of a pattern. A 
successful match involving the pattern 

('BED' I 'BUG' I 'BOMB') . OUTPUT 

causes the successful alternative to be printed. Using $ to associate OUTPUT with several 
parts of a pattern achieves the effect of tracing the progress of the scanner. By 
constructing BR as 

BR = (' B ' I ' R ') $ OUTPUT ('E' | 'EA') $ OUTPUT 

+ ('D' I 'DS') $ OUTPUT 

the output resulting from execution of the statement 

is 

R 
E 

EA 
D 


' READS' BR 


:S(L1)F(L2) 
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Z5.5 Value Assignment in Replacement Statements 

Value assignment is a necessity in some kinds of replacement statements. In the 
following replacement statement E or EA is replaced with I only if the overall pattern BR 
matches. In effect, the replacement statement changes BED and BEAD into BID, BEDS and 
BEADS into BIDS , etc. 

BR = CB' I 'R') . FIRST CE' | 'EA') CD' | 'DS') . LAST 

STR BR = FIRST '!' LAST 

The replacement statement works properly because conditional assignment is done after 
pattern matching, but before the object expression is evaluated. 


2.5.6 Association of Several Variables with One Pattern 

Earlier examples illustrated how variable association may be nested. It is also possible to 
associate more than one variable with a single pattern structure. The statement 

PAT = PI $ VI . V2 

builds a pattern structure where variables V1 and V2 are both associated with the pattern 
PI, VI as immediate assignment and V2 as conditional assignment. Changing the order 
of association to 

PAT = P1 . V2 $ VI 

has no effect on the value assignment. If PAT is involved in a successful pattern match, 
VI and V2 are assigned the same value. If the pattern match fails, the value of VI might 
be changed but the value ofV2 is not. 


2.6 The Null String in Pattern Matching 

The null string is the string of zero length. Attempts by the scanner to match the null 
string always succeed. The variable NULL has the null string as its initial value and, by 
convention, is used in this chapter to represent the null pattern that matches a string of 
zero length. Pattern matching in the statement 

STR NULL :S(ON)F(ERROR) 

always succeeds, even if STR itself has the null string as value. 

The variable NULL is frequently used in more complex patterns. For example, a pattern 
that matches the eight strings 

C BC 

D BD 

AC ABC 

AD ABD 
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can be written as 

(NULL I 'A') (NULL | 'B') ('C | 'D') 

Matching a pattern of the form 
NULL $ X $ Y PAT 

sets the values of X and Y to the null string before matching of PAT begins. 

A number of patterns described in this chapter match the null string. Where bead 
diagram representations of the patterns are given, NULL is used to indicate the null string. 


2.7 Cursor Position 

The unary operator a is called the cursor position operator. Its operand is a variable. 
The value of aX is a pattern structure that matches the null string and assigns the current 
cursor position as an integer value of the variable X. Assignment of the cursor postion to 
the operand of the a operator takes place as immediate value assignment. Value is 
assigned when the cursor position operator is encountered during pattern matching, not 
following successful completion. 

Execution of the following statements assigns the integers 0, 1,2, 3, 4, and finally 5 to 
the variable HEAD. 

6ANCH0R = 0 

'TEST AT OPERATOR' aHEAD 'AT' 

Pattern matching succeeds when the cursor is initially positioned to the left of the AT. 
The cursor position at this point is 5, the final value assigned to HEAD. 

The following program reads a collection of input strings one at a time, prints them, 
matches for a pattern, and underlines the substrings matched. The cursor position operator 
is used to locate the beginning and end of the matched substring. From the cursor 
positions a second line is constructed, consisting of dashes in the position of the matched 
substring. 


6TRIM = 1 

P = ('B' I 'R') ('E' 

PATTERN = ax P aY 
LOOP OUTPUT = INPUT 
OUTPUT PATTERN 
OUTPUT = DUPL( 

BLLINE OUTPUT 
NOPAT OUTPUT 
END 


I ' EA' ) ( ' D' I ' DS ' ) 

:F(END) 

:F(NOPAT) 

DUPL('-',Y - X) 

(LOOP) 
(BLLINE) 


X) 

'P FAILED TO MATCH 
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For the input data 

THE BEADS ARE RED. 
BRED AND BORED. 

BEAUTY AND THE BEAST. 

the program output is 

THE BEADS ARE RED. 


BRED AND BORED. 


BEAUTY AND THE BEAST. 
P FAILED TO MATCH. 


2.8 LEN 


LEN (integer) is a primitive function whose value is a pattern structure that matches 
any string of the specified length. The argument of LEN must have a nonnegative integer 
value when pattern matching is performed. In the following example, pattern matching 
succeeds only if the subject STR has in it somewhere an open parenthesis separated from a 
closed parenthesis by exactly five characters. 

STR '(' LEN(5) ')' :StLl)F(L2) 


LEN can be used to break out fixed-length fields from strings. In the following example 
dates from data cards such as 


1290 SEP. 27 CHINA, CHIHLI 100,000 

1293 MAY 20 JAPAN, KAMARKURA 30,000 

1531 JAN. 26 PORTUGAL, LISBON 30,000 

are reformatted as 


SEP. 

27, 

1290 

CHINA, 

CHIHLI 

100,000 

MAY 

20, 

1293 

JAPAN, 

KAMARKURA 

30,000 

JAN. 

26, 

1531 

PORTUGAL, LISBON 

30,000 


A program that performs this transformation is 
6ANCHOR = 1 

DATE = LEN(4) . YR ' ' LEN(4) .MO ' ' 
LOOP CARD = INPUT 

CARD DATE = MO ’ ' DAY ’, ' YR ' 
OUTPUT = CARD 

NOGOOD OUTPUT = CARD ' IMPROPERLY FORMATTED.' 
END 


LEN(2) . DAY 
:F(END) 

:F(NOGOOD) 
:(LOOP) 

;(LOOP) 
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LEN is used to match the various pieces of the data, assigning the strings found to the 
variables YR , MO , and DAY. YR , MO , and DAY are assigned values after pattern matching 
but before the entire substring matched by DATE is replaced. Only the date portion of 
CARD is reformatted. 


2.9 SPAN and BREAK 

SPAN and BREAK are primitive functions whose values are pattern structures that match 
runs of characters. Patterns described by 

a run of blanks, 
a string of digits, and 
a word (run of letters) 

can be formed using SPAN as 

SPAN(' ' ) 

SPAN('0123456789') 

SPAN('ABCDEPGHIJKLMNOPQRSTUVWXYZ') 

Patterns described by 

everything up to the next blank, 

everything up to the next punctuation mark, and 

everything up to the next number, 

can be formed using BREAK as 

break!' ' ) 
break! 

break!'+-0123456789') 

Arguments of BREAK and SPAN must be nonnull strings when pattern matching is 
performed. 

The pattern structure for SPAN matches the longest string beginning at the cursor that 
consists solely of characters appearing in the argument. SPAN may be thought of as 
streaming from the cursor until a character not included in the argument is found. SPAN 
must match at least one character, or it fails. 

BREAK generates a pattern structure that matches the longest string beginning at the 
cursor that does not contain a character of the argument. Thus, regarding its argument as 
a list of ‘break’ characters, BREAK streams from the cursor up to, but not including, the 
first break character. BREAK must find a break character, or it fails. If the cursor is 
positioned immediately to the left of a break character, BREAK matches the null string. 

A bead diagram for the statement 

'IT RUNS.' break!' ') span!' ') BREAK!'.') 
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illustrates how the cursor is moved by SPAN and BREAK. 


t 


IT RUNS. 

t 


IT RUNS. 

t 





0 


IT RUNS^. 

t 


IT RUNS., 


break!' ')) -(span!' ')^ -{ BREAK !'.')^-K'.' 



€ 


The next program illustrates the use of both BREAK and SPAN. It compresses tabulated 
data, leaving fields separated by single colons rather than an arbitrary number of blanks. 
For example, if the input is 


ACTINIUM 

AC 

89 

227* 

1899 

DEBIERNE 

ALUMINUM 

AL 

13 

26.9815 

1825 

OERSTED 

AMERICIUM 

AM 

95 

243* 

1944 

SEABORG 

ANTIMONY 

SB 

51 

121.75 

1450 

VALENTINE 


the output is 

ACTINIUM:AC:89:227*:1899:DEBIERNE 
ALUMINUM:AL:13:26.9815:1825:OERSTED 
AMERICIUM:AM:95:243«:1944:SEABORG 
ANTIMONY:SB:51:121.75:1450:VALENTINE 

A program that performs this transformation is 

SANCHOR = 1; 6TRIM = 1 

FIELD = break!' ') . CHARS SPAN!' ') 

LOOP CARD = INPUT 

INLOOP CARD FIELD = CHARS ':' 

PUNCH = CARD 


F!END) 

S!INLOOP) 
!LOOP) 


END 
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Each input card is repeatedly examined for a run of blanks, and the blanks are replaced 
by a colon. When blanks no longer exist, the compression is complete and a new card is 
punched. 

Some care must be exercised in using BREAK, since it does not match the break 
character that stops the streaming. Suppose a program is wanted which restores, to some 
degree, the compressed data generated above. Each field of the compressed data can be 
broken out using a statement such as 

CARD break!':') . FLD ’:’ = 

Since BREAK! ' : ' ) does not ‘consume’ the colon, the literal is included to remove the 
break character. 

SPAN never matches a string shorter than the maximum span. For example, 

'9824761.' span!'0123456789') '6' 

cannot succeed since SPAN always matches up to the decimal point. 

In the event that components of the pattern beyond BREAK fail, BREAK does not skip 
over the break character to continue streaming. In the anchored mode the following 
statement never succeeds. 

'123,427,642.00' BREAK!'.,') LEN!1) 'O' 

BREAK! ' . , ' ) matches 123 and that is all. Similarly, SPAN does not skip over a matched 
span of characters if components of the pattern beyond SPAN fail. 


2.10 ANY andNOTANY 

Any! string) and NOTANY!string) are primitive functions whose values are pattern 
structures that match single characters. ANY matches any character appearing in its 
argument. NOTANY matches any character not appearing in its argument. Thus, the 
pattern structure for ANY!'AEI0U') matches any vowel. The pattern for 
NOTANY! 'AEIOU' ) matches any character that is not a vowel. Arguments of ANY and 
NOTANY must be nonnull strings when pattern matching is performed. 

ANY and NOTANY are fast ways of looking for one of a set of single characters. For 
example, 

any!'AEIOU') 
is preferable to 

'A' I 'E' I 'I' I 'O' I 'U' 

The call 


NOTANY!'STRUCTURE') 
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is valid even though the characters T , R , and U appear twice. The order of characters is 
irrelevant. NOTANY ( ' CERSTU ' ) is equivalent to NOTANY ( ' STRUCTURE ' ) . 

A complete syntactic recognizer for SNOBOL4 statements is included in Appendix C. 
ANY is used in patterns to recognize several of the syntactic elements. 

Identifiers begin with a letter which may be followed by any number of letters, digits, 
periods, and underscores. A binary operator is one or more blanks (for concatenation) or a 
binary symbol surrounded by blanks. A label begins with a letter or digit which may be 
followed by anything up to a blank or semicolon. 

If the following assignments are made 

OPSYMS = '+-*/.$£a I ?-,#Xl ' 

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 

DIGITS = '0123456789' 

ALPHANUMERICS = LETTERS DIGITS 

patterns to recognize identifiers, binary symbols, binary operators, and labels are 

IDENTIFIER = ANY(LETTERS) (NULL | SPAN(ALPHANUMERICS 
BINARYSYM = ANY(OPSYMS) | '♦♦' 

BINARYOP = SPAN(' ') (NULL | BINARYSYM SPAN(' ')) 

LABEL = ANY(ALPHANUMERICS) BREAK('; ') 


2.11 TAB, RTAB, and REM 

TAB (integer) and RTAB ( integer ) are primitive functions whose values are pattern 
structures that match all characters from the current cursor position up to a specific point 
in the subject string. TAB (N ) matches up through the N th character of the subject string. 
RTAB(N) matches up to but not including the Nth character from the right end of the 
subject string. Stated another way, TAB (N ) insures that N characters are matched by 
positioning the cursor to the right of the N th character. RTAB (N ) insures that all but N 
characters are matched by positioning the cursor to the left of the N th character from the 
end. For example, in the statement 

'SNOBOL4' LEN(2) TAB(6) 

the pattern matches the substring SNOBOL with TAB ( 6 ) matching OBOL . In a similar 
statement, 

'SNOBOL4' LEN(2) RTAB(1) 

the substring SNOBOL is once again matched with RTAB ( 1 ) matching OBOL. 

RTAB (0) is particularly useful for matching everything to the end of the subject string. 
For convenience, the variable REM has as its initial value the pattern structure for 
RTAB (0 ) . Thus, the pattern 


LASTS 


RTAB(8) REM . L8 
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matches the entire subject and assigns the last eight characters as the value of L8. 

TAB and RTAB require integer arguments when pattern matching is performed. If the 
argument of TAB or RTAB is negative, a program error occurs. An argument that would 
require moving the cursor left causes failure. The statement 

STR LEN(5) TAB(4) 

fails because the cursor cannot be moved back by TAB (4) . 

TAB and RTAB are particularly valuable in breaking fields out of structured data. The 
following data is part of the 1964 list of congressmen from New Jersey. 

Column 4 Column 30 Column 36 

I i 

1 WILLIAM T. CAHILL REP COLLINGSWOOD 

2 THOMAS C. MCGRATH, JR. DEM MARGATE CITY 

3 JAMES J. HOWARD DEM WALL 


14 DOMINICK V. DANIELS DEM JERSEY CITY 

15 EDWARD J. PATTEN DEM PERTH AMBOY 

Suppose a new deck of cards is desired, listing only the names left justified at column 1, 
and the post office address right justified at column 44. The following program reads the 
cards, breaks out the NAME and PO fields, formats and punches a new deck. 


LOOP 


END 


6ANCHOR = 1 f 6TRIM = 1 

NAMEANDPO = TAB(3) (TAB(29) . NAME) TAB(35) (REM . PO) 
CARD = INPUT ;F(END) 

CARD NAMEANDPO :F(ERROR) 

NAME = TRIM(NAME) 

OUTPUT = NAME DUPL(' ',44 - (SIZE(NAME) + SIZE(PO))) PO 
PUNCH = OUTPUT :(LOOP) 


Fields are broken out of the input cards using the pattern NAMEANDPO . The NAME field 
has trailing blanks that are trimmed before the output line is formatted. The post office 
address is obtained using REM and does not have trailing blanks since the input card was 
initially trimmed. DUPL is used to insert the proper number of padding blanks between 
NAME and PO on output. Output from the program is 

WILLIAM T. CAHILL COLLINGSWOOD 

THOMAS C. MCGRATH, JR. MARGATE CITY 

JAMES J. HOWARD WALL 


DOMINICK V. DANIELS 
EDWARD J. PATTEN 


JERSEY CITY 
PERTH AMBOY 
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A bead diagram illustrating the match of NAMEANDPO and the first data card is shown 
below. 
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2.12 POS andRPOS 

POS ( integer) and RPOS ( integer) are primitive functions whose values are pattern 
structures. These pattern structures match the null string if the cursor is at a point in the 
subject string specified by the integer argument. POS(N) succeeds, matching the null 
string, only if the cursor is positioned just at the right of the Nth character. RPOS(N) 
succeeds, matching the null string, only if the cursor is positioned just at the left of the 
Nth character from the end of the subject string. POS and RPOS never cause the cursor to 
be moved; they test its position. For example, in the statements 

6ANCHOR = 1 

STR SPANC ') POS(7) 

pattern matching succeeds only if the first seven characters are blanks and the eighth is 
not a blank. In the following example, 

EANCHOR = 1 

STR SPANC ') RPOS(7) 

pattern matching succeeds only if the seventh character from the end of STR is nonblank 
and everything preceeding it is blank. 

POS (0) is a pattern that succeeds only if the cursor is at the left of the subject string. 
RPOS (0) succeeds only if the cursor is at the right of the subject string. POS (0) and 
RPOS ( 0 ) can serve as left and right anchors for any pattern P, as in 

ENTIRE = POS(O) P RPOS(O) 

In the statement 

STR ENTIRE 

pattern matching succeeds only if P matches all of STR. If, at the time ENTIRE is built, P 
has the value 

'CAR' I 'CART' I 'CARTE' 

Matching in the statement 

'CARTE' ENTIRE 
is illustrated by the bead diagram: 


.CARTE 

t 



' CARTE' 





























46 Pattern Matching 


2.12 



Arguments for POS and RPOS must have nonnegative integer values when pattern 
matching is performed. Negative or noninteger arguments cause a program error. 

The following program uses POS, RPOS, SPAN, and BREAK to list cards that do not 
conform to a specified format. Cards, to be valid, must have four columns of data right 
justified at columns 10, 20, 30, and 40. Data in any field must contain no more than nine 
nonblank characters. 

SPAN and BREAK are used to locate fields on a card while POS and RPOS verify the 
location of the fields. A number indicating the position of an invalid card in the input 
deck is concatenated onto the front of an invalid card before printing. 

£ANCHOR = 1 

FIELD = SPAN(' ') break!' ') 

FORMAT = POS(O) FIELD POS(10) FIELD POS(20) FIELD 
+ POS(30) FIELD POS(40) SPAN(' ') RPOS(O) 

= 'CARDS WITH IMPROPER FORMAT ARE;' 


OUTPUT 
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OUTPUT = 



LOOP 

N = N + 1 




CARD = INPUT 


:P(END) 


CARD PORMAT 


:S(LOOP) 


OUTPUT = '#' N DUPL(' 

',3 - SIZE(N)) CARD 

:(LOOP) 

END 





Input supplied to the program for test purposes is 


XXXXXXXXX 

xxxxxxx 

xxxxx 

X 


xxxxxxx 

xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

XXXXXXXXX 


xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 


XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 

XX 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

xxxxxxxxxx 

xxxxxxx 

xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 

Output from the 

program is 




#2 


xxxxxxx 

xxxxx 

X 

#4 

XXXXXXXXX 


xxxxx 

X 

#6 

XXXXXXXXX 

xxxxxxx 

xxxxx 


#8 

XXXXXXXXX 

xxxxxxx 

xxxxx 

XX 

#10 

xxxxxxxxxx 

xxxxxxx 

xxxxx 

X 

#12 

XXXXXXXXX 

xxxxxxx 

xxxxx 

X 


2.13 PAIL 


PAIL is a variable whose initial value is a primitive pattern that always fails to match. 
PAIL causes the scanner to seek alternatives. 

Consider the following statements. 

6ANCH0R = 0 

'MISSISSIPPI' ('IS' I 'SI' I 'IP' I 'PI') $ OUTPUT PAIL 

Normally, the pattern would match the first IS, print it, and terminate successfully. 
However, PAIL causes the scanner to back up after printing the IS to look for another 
alternative. SI is found and printed, and again PAIL causes the scanner to back up. 
Thus, PAIL causes the scanner to find and print all six substrings of MISSISSIPPI that 
the pattern 

('IS' I 'SI' I 'IP' I 'PI') 
matches before terminating in failure. 
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In general, the behavior of the scanner during any pattern match may be observed 
using a statement of the form 

STR PAT $ OUTPUT FAIL 

FAIL is generally used when a programmer wishes to force the scanner to try a number 
of alternatives even though some may succeed. In the following example, strings are read 
from cards and printed only if 

(1) the string contains at least one A, 

(2) the string contains at least one B, and 

(3) the string does not contain a C. 

The behavior is similar to the logical AND. 

SANCHOR = 1 ; STRIM = 1 

OUTPUT = 'ACCEPTABLE WORDS ARE;' 

OUTPUT = 

PAT = (break!'A') $ A I break!'B') $ B | 

+ break!'C') $ C) FAIL 

LOOP STRING = ' ' INPUT 

A = 

B = 

C = 

STRING PAT 

!differ!a) differ!b) ident!c)) 

OUTPUT = STRING 

END 

For the input strings 

ALPHA 
BETA 

ABRACADABRA 
ABSOLUTELY 
AWFUL 
ALBEIT 

the output is 

ACCEPTABLE WORDS ARE; 

BETA 

ABSOLUTELY 
ALBEIT 

PAT uses BREAK to look for the characters A, B, and C. If one is found, the 
corresponding variable is assigned a nonnull value. FAIL forces the scanner to check all 
three alternatives of PAT , whether they succeed or not. Thus, if the string contains an A 
and B but not C , the values of the variables A and B will be nonnull and the value of C 
will be null. DIPPER and IDENT are used to check the values of A, B, and C. 


; F!END) 


; F!LOOP) 
; ! LOOP) 
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2.14 FENCE 

The variable PENCE has a primitive pattern structure as its initial value. FENCE 
matches the null string when encountered by the scanner moving left to right through a 
pattern. If a subsequent failure causes the scanner to back up to FENCE , seeking an 
alternative, the pattern match is terminated. Considering FENCE as a bead, the needle 
passes freely from left to right. Attempting to pull the needle back through FENCE causes 
failure of pattern matching. 

Consider the following statements: 

6ANCH0R = 1 

'BERATES' ('BE' | 'GE' | 'PRE') ('TS' | 'T') 

BE matches, and both TS and T fail. At this point the scanner backs up and tries GE and 
FRE , both of which fail. Looking at the pattern, it is obvious that GE and FRE should not 
be tried because the first two characters are known to be BE. 

Inserting FENCE between the groups of alternatives eliminates the problem. 

'BERATES' ('BE' | 'GE' | 'PRE') FENCE ('TS' | 'T') 

Now, if BE matches, FENCE keeps the scanner from needlessly backing up to look at GE 
and FRE. 

FENCE can be used to temporarily anchor the scanner in a program that otherwise 
operates in the unanchored mode. Inserting FENCE before PAT in the statement 

STR (FENCE PAT) 

causes pattern matching to fail if PAT does not match beginning with the first character 
of STR. 


2.15 ABORT 

ABORT is a variable whose initial value is a primitive pattern structure that causes 
immediate failure of the entire pattern match. No alternatives are tried, and the statement 
fails. 

ABORT is useful in constructing conditional pattern matching statements. For instance, in 
processing SNOBOL4 source decks as data, the following pattern ignores comment cards, 
but matches all others against the pattern CARD. 

CARDFORM = '♦' ABORT | CARD 

Similarly, the pattern 

SHORTPAT = LEN(12) ABORT j PAT 

permits an attempt to match PAT only if the subject string is less than 12 characters long. 
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In general, a pattern described by a statement of the form, ‘has characteristics of P but 
not Q,’ can be implemented by 

PNOTQ = Q ABORT | P 


2.16 Unevaluated Expressions 


The unary operator ♦ postpones the evaluation of its operand. If E is an expression, 
then ♦E is an unevaluated expression. The unevaluated expression is evaluated when 

(1) the scanner encounters ♦E as part of a pattern structure, or 

(2) ♦E is used as the argument of the primitive function EVAL. 

In this chapter, unevaluated expressions, often simply called expressions, are considered 
only in the context of pattern matching. Chapter 3 describes EVAL in more detail. 

If an unevaluated expression appears as part of a pattern, the expression is evaluated 
when encountered during pattern matching. If evaluation of the expression is successful, 
the value becomes part of the pattern structure and pattern matching continues. If 
evaluation of the expression fails, the scanner backs up, seeking alternatives. Failure during 
evaluation of an expression does not necessarily cause failure of the entire pattern match. 

A typical use for unevaluated expressions is motivated by the following example. Two 
strings are read as input data and a list is made of the words appearing in both strings. 
The list is generated by obtaining words one at a time from the first string using the 
pattern WORD and using the pattern 

' ' W ANY(' . , ' ) 


to determine if each word appears in the second string. 


£ANCHOR 
WORD = 
STRING1 
STRING2 
LOOP STRING1 

STRING2 
LIST = 
OUTPUT OUTPUT 
END 


= 0; 6TRIM 

BREAK(' .,') . 

= INPUT ' ' 

= ' ' INPUT 

WORD = 

' ' W ANY{' 

LIST W ' , ' 

= LIST 


= 1 
W SPAN(' 






For the input strings 

THESE TWO STRINGS ARE ALMOST ALIKE. 
THE TWO STRINGS AREN'T ALIKE. 


F(ERROR) 

P(ERROR) 
F(OUTPUT) 
F(LOOP) 

(LOOP) 


the printed output is 
TWO, STRINGS, ALIKE, 
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As programmed above, a pattern structure for 

' ' W ANY(' .,') 


must be built during each pass through the loop because of the structure for ANY. 
Constructing the pattern outside of the loop is not appropriate either, since the value of W 
changes for each iteration of the loop. Using an unevaluated expression in place of the 
variable W does permit the pattern structure to be constructed outside of the loop. As 
illustrated below, the pattern structure for FINDW contains ♦W in place of W. The 
expression ♦W is not evaluated until needed in pattern matching. The value of W used 
during pattern matching is the current value, in this case the value just assigned by 
matching WORD. 


6ANCH0R = 0; STRIM = 1 

WORD = break!' .,') . w span!' 

FINDW = ' ' ♦W ANY!' .,') 

STRING1 = INPUT ' ' 

STRING2 = ' ' INPUT ' ' 

LOOP STRING1 WORD = 

STRING2 FINDW 

LIST = LIST W ' , ' 

OUTPUT OUTPUT = LIST 
END 




F!ERROR) 

F!ERROR) 
F!OUTPUT) 
F!LOOP) 

!LOOP) 


Unevaluated expressions are valid arguments for primitive pattern-valued functions. A 
pattern structure for the function is built, but the argument remains unevaluated until 
pattern matching is performed. 


The following program is similar to the one above, except that characters common to 
two strings are sought rather than words. Each character of the first string is obtained 
using the pattern CHAR. The pattern FINDCH uses the unevaluated expression ♦CH as the 
argument of BREAK. When FINDCH is used in pattern matching, the current value of CH, 
that just obtained by matching CHAR , becomes the break character. 


SANCHOR = 1; STRIM 

CHAR = LEN!1) . CH 
FINDCH = BREAK!*CH) 
STRING1 = INPUT 

STRING2 = INPUT 
LOOP STRING! CHAR = 
STRING2 FINDCH 
LIST = LIST CH 
OUTPUT OUTPUT = LIST 
END 


F!ERROR) 

F!ERROR) 

F!OUTPUT) 
F!LOOP) 

!LOOP) 


For the input strings 

TWO STRINGS FOR TESTING 
ABCDEFGHIJKLMNOPQRSTUVWXYZ 
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the output string is 
TWOSTRINGSPORTESTING 


In pattern matching, unevaluated expressions can be used in a variety of ways as 
illustrated by the following examples. 


2.16.1 Example 1 

PAIR is a pattern that matches any two consecutive identical characters. PAIR uses 
LEN ( 1 ) to match any character, and immediate value assignment to assign the character 
as value of X . The expression ♦X is then evaluated and must match the same character as 
LEN( 1) . 


PAIR = (LEN(1) $ X ♦X) . OUTPUT 

'COOK' PAIR 
'COMMON' PAIR 
'AARON' PAIR 
'CHICKADEE' PAIR 

END 

Output from the program is: 

00 

MM 

AA 

EE 


2.16b2 Example 2 

Given any subject string STR and any pattern P, BIGP finds the longest substring of 
STR that P matches. 

BIGP = {♦P $ TRY ♦GT(SIZE(TRY),SIZE(BIG))) $ BIG FAIL 

BIGP uses two variables, BIG and TRY. During pattern matching, the value of BIG is 
the longest substring found. Before pattern matching, BIG must be initialized to the null 
string. TRY is assigned every substring that the pattern P matches. If TRY is longer than 
BIG, the value of BIG is updated. 

BIGP utilizes unevaluated expressions in two ways. ♦P allows BIGP to be constructed 
without specifying the value of P . The value of P is determined during pattern matching. 
The predicate ♦GT ( SIZE (TRY), SIZE (BIG ) ) is evaluated during pattern matching 
whenever ♦P matches a substring. It compares the size of TRY with the size of BIG. If the 
new substring is shorter, the predicate fails. Failure of a predicate or function during 
pattern matching causes the scanner to back up seeking alternatives. If the new substring 
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is longer, the predicate succeeds, returning the null string as value. This null string is 
immediately matched. The variable BIG is then assigned the new substring as value. 
FAIL causes the scanner to back up and look for another substring matched by P. 

The following is a test program for BIGP. 

BIGP = (^P $ TRY ♦GT(SIZE(TRY),SIZE(BIG))) $ BIG FAIL 

STR = 'IN 1964 NFL ATTENDANCE JUMPED TO 4,807,884? ’ 

+ 'AN INCREASE OF 401,810.' 

P = SPAN('0123456789, ’ ) 

BIG = 

STR BIGP 

OUTPUT = 'LARGEST NUMBER IS ' BIG 
P = SPAN('ABCDEFGHIJKLMNOPQRSTUVWXYZ') 

BIG = 

STR BIGP 

OUTPUT = 'LARGEST WORD IS ' BIG 

END 

The output is 

LARGEST NUMBER IS 4,807,884 
LARGEST WORD IS ATTENDANCE 


Z16.3 Example 3 

Recursive definitions of patterns are possible using unevaluated expressions. The pattern 
structure for 

P = P 'Z' I 'Y' 

is constructed using the previous value of P . If P was null, the new value of P matches 
the strings Y and Z . 

If the value of P is left unevaluated as in 
P = ♦P 'Z' I 'Y' 

the value of P at pattern matching time (which is ♦P ' Z' | 'Y' ) replaces ♦P, giving rise 
to a recursive definition. The pattern P matches either Y or anything matched by P 
followed by Z. Therefore, since P matches Y, it also matches YZ. Since P matches YZ , it 
also matches YZZ , etc. Thus, P matches strings of the form 

Y 

YZ 

YZZ 

YZZZ 
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A test program for the recursive definition of P follows. 

P = ♦P 'Z' I 'Y' 

PO = P . OUTPUT 
'Y' PO 
'YZZZ' PO 
'XYZ' PO 
'YZZX' PO 
'AYZZZZB' PO 

END 

Output from the program is 
Y 

YZZZ 

YZ 

YZZ 

YZZZZ 


2.16.4 Example 4 

Recursive definitions can be quite complicated, as in the following example, which 
recognizes a simple class of arithmetic expressions. 



£ANCHOR 

= 1 



VARIABLE 

= ANY('XYZ') 



ADDOP = 

ANY('+-') 



MULOP = 

ANY('♦/') 



FACTOR 

= VARIABLE 1 '(' ♦EXP ')' 



TERM = 

FACTOR 1 ♦TERM MULOP FACTOR 



EXP = 

ADDOP TERM | TERM | ♦EXP ADDOP 

TERM 

LOOP 

STRING 

= TRIM(INPUT) 

:F(END) 


STRING 

EXP RPOS(O) 

:F(NOGOOD) 


OUTPUT 

= STRING ' IS AN EXPRESSION.' 

;(LOOP) 

NOGOOD 

OUTPUT 

= STRING ' IS NOT AN EXPRESSION.' 

;(LOOP) 


END 

Output for typical data is 

X+Y*(Z+X) IS AN EXPRESSION. 
X-l-Y-l-Z IS AN EXPRESSION. 

XY IS NOT AN EXPRESSION. 
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2.16.5 Example 5 


A useful form of recursive pattern definition is one that seeks repetitions of a basic 
pattern. In the following example, input cards are read and leadering (consisting of 
alternating blanks and periods) is inserted for runs of three or more blanks. 

The pattern P matches an odd number of blanks by seeking repetitions of two blanks 
followed at the end by a single blank. PAT uses P to match an odd number of blanks that 
is at least three in length. The actual replacement of blanks by leadering takes place in 
the statement labelled LOOP 1 that is executed once for each span of blanks in a card. 


STRIM = 1 

p = ' ' ♦p I ' ' 

PAT = aY ' ' p ax 

LOOP CARD = INPUT 
OUTPUT = CARD 

LOOP 1 CARD PAT = DUPL(' .',(X-Y)/2) 

OUTPUT = CARD 

END 


: F (END) 

; S(LOOP 1) 
: (LOOP) 


For the input data 
INSERT 

TRY TWO BLANKS 
TRY THREE BLANKS 

TRY FOUR BLANKS 

TRY FIVE BLANKS 

TRY SEVERAL 

FINALLY, TRY 


LEADERING 


BREAKS 


A RIGHT JUSTIFIED ONE 


the output 

is 


INSERT 


LEADERING 

INSERT . 


. . LEADERING 

TRY 

TWO 

BLANKS 


TRY 

TWO 

BLANKS 


TRY 

THREE BLANKS 


TRY 

THREE . BLANKS 


TRY 

FOUR 

BLANKS 


TRY 

FOUR 

. BLANKS 


TRY 

FIVE 

BLANKS 


TRY 

FIVE 

. . BLANKS 


TRY 


SEVERAL 

BREAKS 

TRY 

• • 

. . SEVERAL . 

. . BREAKS 

FINALLY, 

TRY 


FINALLY, 

TRY 



A RIGHT JUSTIFIED ONE 
A RIGHT JUSTIFIED ONE 
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2.17 ARB 

ARB is a variable whose initial value is a primitive pattern structure that matches zero 
or more characters. When first encountered by the scanner moving from left to right, ARB 
matches the null string. When ‘backed into’ on subsequent occasions, ARB increases the 
size of the substring it matches by one. ARB fails only when it can no longer increase the 
length of the substring it matches. 

ARB can be used in the construction of patterns typified by the statement, ‘any string 
containing both CAT and DOG .’ Nothing is said about the order in which they appear or 
their separation. A suitable pattern is 

CATANDDOG = 'CAT' ARB 'DOG' | 'DOG' ARB 'CAT' 

Matching CATANDDOG against the strings 

CATALOG FOR SEADOGS 
DOGS HATE POLECATS 
CATDOG 

ARB matches the substrings 

ALOG FOR SEA 
S HATE POLE 

and the null string, respectively. 

The pattern structure for ARB has implicit alternatives. When ‘backed into’ because of 
failure, ARB attempts to find another suitable substring rather than fail. Only when all 
implicit alternatives have failed is the needle passed to an explicit alternative or back to a 
previously successful component. 

The following definition of ARB is equivalent to the initial definition given to ARB. 

ARB = NULL I LEN(1) >l<ARB 
A bead diagram for ARB is 



LEN( 1) 



If the bead for *ARB is replaced with the bead diagram for ARB , an expanded bead 
diagram for ARB becomes 





2.18 


ARBNO 


57 




It can be seen from the bead diagram that 

(1) the null string is matched on the first attempt, 

(2) subsequent attempts increase the substring matched by one character, and 

(3) failure occurs when the size of the substring cannot be increased. 

ARB , although natural, cannot be used with impunity. For example, it should not be 
used as the first component of a pattern unless associated with a variable for value 
assignment. The statement 

STR ARB PAT 

should be replaced by 

STR PAT 

which, when executed in the unanchored mode, behaves in exactly the same way, but is 
much faster. 

ARB should not be used to break fields out of a string if they are separated by known 
delimiters. For example, the statement 

STR break!',') . FIELD ',' = 

is faster than the statement 

STR ARB . FIELD ',' = 

although they accomplish the same thing. 
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2.18 ARBNO 

ARBNO is a mnemonic for ‘arbitrary number of.’ ARBNO (pattern ) is a primitive 
function whose value is a pattern structure that matches zero or more consecutive 
occurrences of strings matched by its argument. When encountered by the scanner in the 
forward direction, ARBNO (pattern) matches the null string. When ‘backed into,’ it tries 
to increase the length of the substring matched by its argument. In the statements 

6ANCHOR = 1 

SUBSTR ARBNO(LEN(3)) RPOS(0) 

the pattern match succeeds only if the length of SUBSTR is zero or a multiple of three. 

The value of ARBNO ( P ) is a pattern structure with implicit alternatives. It is equivalent 
to the pattern structure defined in the statement 

ARBNOP = NULL | P ♦ARBNOP 

A bead diagram having a form similar to that for ARB illustrates the implicit alternatives 
of ARBNOP or ARBNO (P) . 




In the following example the argument of ARBNO has several alternatives. 
SANCHOR = 1 

P = '1234' I '123' I '234' | '341' | '412' 

ARBNOTEST = ARBNO(P) $ OUTPUT RPOS(0) 

'123412341' ARBNOTEST 


END 
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The following bead diagram for ARBNOTEST illustrates how alternatives are handled. The 
output from the program above is a blank line (resulting from the null string), and then 

1234 

12341234 

1234123 

123 

123412 

123412341 



BREAK and SPAN can frequently be used in place of ARBNO. For example, 
ARBNO(' ') 

can usually be replaced by 
SPAN(' ’) 

or, if necessary, 

NULL I SPAN(' ') 

ARBNO is relatively slow and should be avoided if some other pattern will suffice. 
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2.19 BAL 

The initial value of the variable BAL is a primitive pattern structure that matches any 
nonnull string of characters which is balanced with respect to parentheses. BAL matches 

X 

XYZ 

(A+B) 

A(B*C)(E/F)G+H 

BAL does not match 

)A+B( 

((A+B) 

A pair of patterns that are equivalent to BAL are 

BALEXP = NOTANYC'O') | '(' ARBNO (♦BALEXP) ')' 

BAL = BALEXP ARBNO(BALEXP) 

The value of BALEXP matches a single balanced expression that consists of a single 
character that is neither an open nor closed parenthesis, or it matches an arbitrary 
number of balanced expressions (possibly none) enclosed in parentheses. BAL itself matches 
the concatenation of one or more single balanced expressions. 

Insight into the behavior of BAL can be gained from use of the following pattern: 

ALLBAL = BAL $ OUTPUT FAIL 

When used in the unanchored mode, a statement such as 

'((A+(B*C))+D)' ALLBAL 

prints out every balanced expression. The output for this case is 

((A+(B*C))+D) 

(A+(B*C)) 

(A+(B*C))+ 

(A+(B*C))+D 

A 

A+ 

A+(B*C) 

+ 

+(B*C) 

(B*C) 

B 

B* 

B*C 

+ 




2.20 


SUCCEED 


61 


♦C 

C 

+ 

+D 

D 


BAL facilitates the manipulation of algebraic and functional expressions. Programs using 
BAL to translate algebraic expressions from Polish to infix notation, and vice versa, appear 
in Chapter 4. 


220 SUCCEED 

The variable SUCCEED has a pattern structure as its initial value. SUCCEED matches 
the null string when first encountered by the scanner moving left to right through a 
pattern. If a subsequent failure causes the scanner to back up to SUCCEED, seeking an 
alternative, SUCCEED again matches the null string. Thus, SUCCEED always matches the 
null string, both in the forward direction and when alternatives are sought. SUCCEED has 
a bead representation where all implicit alternatives are the null string. 



Since the number of implied alternatives is infinite, the scanner can never back through 
SUCCEED. 

Practical uses for SUCCEED seem limited. However, the light-hearted programmer can 
use SUCCEED and FAIL to produce pattern matches that never terminate: 

= SUCCEED (LEN(1) ARB) $ OUTPUT FAIL 


SAWTOOTH 
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Since PAIL repeatedly causes the scanner to back up and retry ARB, the subpattern 

LEN(1) ARB 

matches first one character, then two, and so on up to the length of the subject string. 
Each substring matched by this subpattern is printed. Eventually ARB cannot match a 
longer string and fails, causing the scanner to back into SUCCEED. SUCCEED matches the 
null string and the entire process repeats itself. 

If the pattern SAWTOOTH is used in the statement 
’XXXXXX' SAWTOOTH 

pattern matching does not terminate, and the following output is produced. 

X 

XX 

XXX 

xxxx 

xxxxx 

XXXXXX 

X 

XX 

XXX 

xxxx 

xxxxx 

XXXXXX 

X 

XX 


SAWTOOTH can never succeed because of FAIL, and can never fail because of SUCCEED. 


2.21 Quickscan Mode 

In the quickscan mode, the scanner uses a number of heuristics to avoid looking at 
alternatives that cannot possibly lead to a successful match. Hence, the scanner operates 
on the assumption that the programmer is not interested in how matching is done, but 
only in the outcome. Typically, patterns concerned with how matching is done employ 
immediate value assignment or unevaluated expressions. Patterns that do not use these 
features can and should be used in the quickscan mode. Patterns using immediate value 
assignment and unevaluated expressions may produce unexpected results in the quickscan 
mode. This section describes the heuristics used by the scanner to speed up pattern 
matching. It points out where unexpected results may arise and what can be done about 
them. The keyword SFULLSCAN initially has a zero value, signifying the normal or 
quickscan mode of pattern matching. 
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This chapter, so far, has been concerned with the basic components of patterns. No 
consideration has been given to the context in which a component occurs. The basic 
notion of the quickscan mode is quite simple: Before a component or bead is matched, its 
context is examined to see if matching should be attempted, terminated, or an alternative 
sought. The easiest question to answer is whether the number of characters remaining in 
the subject string is sufficient to successfully complete a match. Consider the following 
example. 

BD = ('BE' I 'B') ('AR' | 'A') ('DS' | 'D') 

'BEAD' BD 

Four of the possible strings matched by BD are too long: BEARDS, BEARD, BEADS, and 
BARDS . The scanner should avoid them if possible. In the bead diagram which follows, a 
number is associated with each bead. The number represents the minimum number of 
characters necessary to match that bead and anything that follows. If the number is 
greater than the number of characters remaining in the subject string, the scanner does 
not attempt to match the bead against the subject string. 




3 


2 


1 
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BEAD, 

t 



The components AR in step 2 and DS in step 3 are not tried. AR cannot match since two 
characters remain in the subject string and at least three are necessary. Similarly, DS is 
not tried because one character remains and at least two were required. 

In the unanchored quickscan mode, the scanner does not move the initial p>osition of 
the cursor if insufficient characters remain in the subject string. Consider the following 
example. 

6ANCHOR = 0 

'BATS' BD 

Matching fails with the cursor initially positioned to the left of the subject string. It is 
then moved to the left of the A . Since three characters remain in the subject string, only B 
is tried. Failing to match B, the scanner recognizes that further repositioning of the cursor 
is useless. 


,B A T S 

t 



4 




2 





1 


.BATS 

t 



3 


2 


1 
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BATS 

t 






1 


In the quickscan mode, the scanner distinguishes between two kinds of failure: (1) 
failure to match, as when X is compared to T ; and (2) failure because too few characters 
remain in the subject string. In the latter case, the scanner does not allow ARB to match a 
longer substring, nor does it move the initial position of the cursor in unanchored mode. 
Consider the following pattern matching statement executed in the unanchored mode: 

'CAT' ARB 'X' 

Clearly the match cannot succeed. When the scanner reaches the state shown in the 
diagram below, ARB can no longer extend the substring it matches. ARB indicates failure 
because of too few characters. The scanner does not reposition the cursor, and matching 
fails. 



A similar situation arises in the anchored mode for such patterns as 

'CAT' ARB ARB 'X' 

The first ARB matches the null string. The second ARB matches the null string, then C , 
and finally CA before it fails for lack of room. The scanner does not seek an implicit 
alternative for the first ARB , and terminates pattern matching in failure. 
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In the quickscan mode, the scanner recognizes a special case for ARBNO . When backed 
into, ARBNO ( P ) tries to extend the substring matched by finding another instance of P . If 
P is null or has null alternatives, behavior like SUCCEED may result. The scanner tries to 
prevent this. When backed into, ARBNO(P) examines the substring matched by fhe last 
instance of P . If this substring is null, ARBNO does not try to find an additional instance 
of P , but backs up to the last instance of P and seeks an alternative to the null string. 

For example, in the quickscan mode, ARBNO (NULL ) looks like NULL | NULL . The first 
NULL appears because NULL is always attempted independently of the argument to 
ARBNO . The second NULL comes from the argument. 

Behavior of ARBNO (NULL | 'X' ) can be deduced from the output generated by the 
following statement. 

’♦XXX' ('♦' ARBNO(NULL | 'X')) $ OUTPUT FAIL 

The output is 

* 

* 

♦X 

♦X 

♦XX 

♦XX 

♦XXX 

♦XXX 

Left recursion in a pattern structure, as illustrated by 
P = ♦P 'Z' I 'Y' 

could be a problem because it might put the scanner in a loop. In the quickscan mode, 
recursive loops are broken whenever possible. Most looping problems are avoided by a 
look-ahead feature that compares the number of characters remaining with the number of 
characters required together with the assumption that any unevaluated expression matches 
at least one character. 

As an example, consider the following statement: 

'YZZ' P 

It is convenient to think that whenever the bead for ♦P is encountered, it expands into a 
bead diagram for the current definition of P. The process is illustrated by the following 
diagram. 
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When the minimum number of characters required by *P reaches four, the recursive loop 
is broken and the alternative Y is tried, leading to a successful match. 

The assumption that ♦? matches at least one character does not affect the outcome of 
the previous example. Had zero characters been assumed, one more iteration of the loop 
would have been required, and the final diagram would have been as follows. 



However, the one-character assumption keeps the following equivalent statements from 
looping. 
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'Z' 
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If both and *0 can match the null string, the bead diagram grows without limit. With 
the one-character assumption, the two equivalent examples above behave similarly. 

Assuming a one-character minimum for unevaluated expressions can lead to difficulties: 

PAT = ♦W ♦X *Y ♦Z 

The shortest string PAT matches is of length four. The following match, straightforward as 
it seems, fails. 

W = 'C 

X = ’A’ 

Y = 'T' 

Z = 

'CAT' PAT 

As seen in the next section, the match succeeds if the fullscan mode is used. 

Patterns such as BIGP , described in the section on unevaluated expressions, can produce 
unexpected results in the quickscan mode. 

BIGP = (♦P $ TRY ♦GT(SIZE(TRY),SIZE(BIG))) $ BIG PAIL 

The expression ♦GTl SIZE (TRY) ,SIZE(BIG)) is assumed to require one character when, 
in fact, it matches the null string. Therefore, the quickscan mode prevents ♦P from 
matching any substring which includes the last character of the subject string. Hence, in 
the statements 

P = SPAN(*0123456789,') 

*1234.56 789,312* BIGP 

the final value of BIG is 1234 rather than the expected 789,312. Again, as seen in the 
next section, the fullscan mode prevents such difficulties. 

In summary, the following heuristics are used in the quickscan mode to improve the 
efficiency of pattern matching: 

(1) continual comparison of the number of characters remaining in the subject string 
against the number of characters required, 

(2) repositioning of the cursor in the unanchored mode only if sufficient characters 
remain, 

(3) refusal to extend the substring matched by ARB or to reposition the cursor if 
failure is caused by too few characters, 

(4) refusal to extend substring matched by ARBNO ( P ) if the last match of P was the 
null string, and 
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(5) assumption that unevaluated expressions must match at least one character. 


2.22 Fullscan Mode 

In the fullscan mode, all heuristics to improve pattern matching efficiency are turned 
off. Each component of a pattern is matched independently of its context. Furthermore, 
when unanchored, the initial position of the cursor is moved through the entire subject 
string. The fullscan mode of pattern matching is entered by assigning a nonzero value to 
the keyword FULLSCAN. 

The following example, which prints all possible nonnull substrings of a subject, suggests 
applications of the fullscan mode. 

6ANCH0R = 0 

£FULLSCAN = 1 

'12345' (LEN(1) ARB) $ OUTPUT FAIL 

END 

Output from the program is: 

1 

12 

123 

1234 

12345 

2 

23 

234 

2345 

3 

34 

345 

4 

45 

5 

If £ FULLSCAN had been zero, the initial position of the cursor would not have been 
moved, and only the first five lines would have been printed. 

An example, which can only be done in the fullscan mode, is back referencing. This 
pattern succeeds only if a subject string has two identical nonoverlapping substrings of 
length 3: 

BACKR = LEN(3) $ X ARB *X 
The statement 

'ABCDEFGBCDA' BACKR 

succeeds and X has the value BCD . The statement above does not work in the quickscan 
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mode. WhenLENO) matches ABC, ARB eventually matches DEFGBCD and then fails 
because X is assumed to match one character. The condition is recognized in the 
quickscan mode, preventing the initial position of the cursor from being moved. Hence, 
matching fails without BCD ever being matched by LEN( 3 ) . 

In the fullscan mode, the tests of ARBNO for null arguments are turned off. 
ARBNO(NULL) and ARBNO(NULL | 'X') behave like SUCCEED. The statement 

'♦XXX' ('♦' ARBNO(NULL | 'X')) $ OUTPUT FAIL 

causes the scanner to loop, generating output lines consisting of a single ♦. 

Recursive patterns such as 

P = ♦P 'Z' I 'Y' 
do not work because the recursive loop is not broken. 

Patterns such as 

PAT = ♦W ♦X *Y ♦Z 

work for subject strings having fewer than four characters because the one-character 
assumption no longer holds. 

The following example compares the results of a program run in the quickscan and 
fullscan modes. 


2.22.1 Example 

This program prints combinations of characters taken three at a time from a subject 
string. 

define!'F(X,Y,Z)') 

COMB3 = LENd) $ A ARB LEN( 1 ) $ B ARB LEN( 1 ) $ C 
+ ♦FlA^B.C) FAIL 

'123456' C0MB3 :(END) 

F OUTPUT = X Y Z ; (RETURN) 

END 
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Output from Fullscan 


123 

123 

124 

124 

125 

125 


126 


134 


135 


136 


145 


146 


156 


234 


235 


236 


245 


246 


256 


345 


346 


356 


456 


Exercises 

Exercise 2.1: Count the total number of punctuation marks in text, given on a series of 
input cards. Define a punctuation mark to be a period, comma, colon, semicolon, question 
mark, exclamation mark, left parenthesis, right parenthesis, apostrophe, or quotation mark. 


Exercise 2.2: Write a pattern that matches words beginning with C or CR, followed by 0 
or 00 or EE , followed by P or PS . Construct the pattern so that each successful step in 
the pattern matching is explicated by a printed output. 

Exercise 2.3: Write a program that identifies and prints all words of a given length. 
Assume that the length is given on the first input card and that text occurs on following 
cards. Make appropriate assumptions about the format of the first card, the characters 
that are assumed to delimit words, and how the text appears on successive cards. 

Exercise 2.4: Let the label for an assembly language statement begin in column 1 and 
consist of at most eight characters, of which the first is a letter and the remainder are 
letters or digits. The label is followed by at least one blank. A statement without a label 
begins with a blank. Write a program that reads in assembly language statements and 
prints those statements with invalid labels. 

Exercise 2.5: Construct a pattern that matches a SNOBOL4 string literal. 

Exercise 2.6: Write a recognizer for strings of the form 
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ABC,AABBCC,AAABBBCCC, . .. . 

Exercise 2.7: Write a pattern that matches strings that are repetitions of a nonnull 
substring (e.g. 1 1 1 1 1 1 1, 123123123, abcdefabcdefabcdef). 

Exercise 2.8: Construct a pattern that matches strings that are FORTRAN format 
Hollerith field specifications (i.e. a positive integer n, followed by the letter H, and 
followed by n characters). 

Exercise 2.9: Write a program that determines whether a string is palindromic, i.e., 
whether it reads the same forward and backward (e.g. MADAM). 




CHAPTER 3 


Primitive Functions, Predicates, and Operations 


3.1 Introduction 

A function is an operation upon a number of arguments. The value of a function is 
computed by a procedure. There are two types of functions: primitive and programmer- 
defined. Primitive functions are implemented by procedures built into the SNOBOL4 
system. Procedures for programmer-defined functions, described in the next chapter, are 
included in the source program. 

Syntactically, a function call is recognized as an identifier used for a function name, 
followed by a list of arguments separated by commas and enclosed in parentheses. An 
example is 


IDENT(A,'TREE') 

An argument of a function call may be any expression. Execution of a function call causes 
the expressions for the arguments to be evaluated and the values passed to the procedure. 
Thus, the procedure gets only the values of the arguments and not the expressions. 
Consider the following statements: 

A = 'APPLE' 

B = 'SEED' 

APPLE = 'FRUIT' 

SEED = 'TREE' 

APPLESEED = 'FRUITTREE' 

IDENT($A $B,$(A B)) 

FRUITTREE is the value of each argument to IDENT. The two strings FRUITTREE are all 
that the procedure for IDENT knows of its arguments. 

A variable such as C is an expression, albeit a trivial one. Thus, if 

C = 'CLAW' 

D = 'TIGER' 


the call 


IDENT(C,D) 

passes the strings CLAW and TIGER, the values of C and D, as arguments to the procedure 
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for IDENT. Furthermore, since the procedure for IDENT knows nothing about C and D, it 
cannot possibly change their values. 

Any omitted argument is assigned the null string as value. Thus, IDENT(E) compares 
the value of E and the null string. The use of too many arguments in the call of a 
primitive function is an error. 

A function call is an expression and has a value. The value of a function call may be 
of any data type. A programmer must always be aware that a function call has a value, 
even if it is the null string. Otherwise, as later examples illustrate, unexpected results may 
arise. 

A function call may succeed or fail, depending upon the outcome of the associated 
procedure. If the procedure for a function is successful, the value computed by the 
procedure becomes the value of the function call. If the procedure fails, the function call 
fails. 

This chapter only describes a few primitive functions. Others dealing with pattern 
matching, input and output, arrays, and programmer-defined data types, and so forth, are 
described in other chapters. A list of functions appears in Chapter 7 and page references 
are included in the index. 


3.2 Numerical Predicates 

Several primitive functions are concerned with testing relations between arguments. 
These functions, which succeed or fail depending on whether the relation is true or false, 
are called predicates. If a predicate is successful, the value of the call is the null string. 


3.2.1 LT, LE, EQ, NE, GE, and GT 

A predicate, such as GE ( X , Y) , succeeds if X stands in the given relation to Y. The 
arguments to numerical predicates may be integers or reals. Thus, if 

X = 17.0 

Y = ' 3 ’ 

then GE ( X , Y) succeeds and LT (X , Y) fails. If an argument is omitted, it is assigned the 
null string, which is treated as zero. If M is 2 , then EQ (M) fails, but EQ (M - 2) succeeds 
and returns the null string. If a real number is compared with an integer, the integer is 
converted to real before the comparison. 

Numerical predicates frequently are used for loop control. For example, if N has as its 
value the number of times a loop has been executed and M is the limit on N , the following 
statement checks N against M , and increments N if it is less than M . 

N = LT(N,M) N + 1 ;S(LOOP)F(OUT) 
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Evaluation of the object expression takes place before assignment is made. Thus, the 
evaluation ofLT(N,M) takes place before N is incremented. IfLT(N,M) succeeds, the 
value is the null string. Concatenation of the null string with N + 1 does not affect N + 1, 
so N is properly incremented. Furthermore, since the statement succeeds, control passes to 
LOOP. If LT (N, M) fails, N is not incremented and control passes to the statement labelled 
OUT. 

Placement of predicates in a statement is important. Consider the following statement, 
which looks as if it might be suitable for loop control. 

N LT(N,M) = N + 1 :S(LOOP)F(OUT) 

The statement does not properly increment N . If N is 2 and M is 4, the value of N after 
execution of the statement is 3 2 . The predicate LT ( N, M ) , situated where it is, is treated 
as a pattern. Since LT(N,M) is null, the pattern matches the null string. The null string 
matched in the value ofN is replaced by N + 1 , leading to the unexpected result 32. 


3.2.2 INTEGER 

It is frequently desirable to test whether the value of a variable is an integer. The 
predicate test INTEGER(X) succeeds if the value of X is an integer and fails otherwise. 
Thus, 

INTEGER(X) 


succeeds for 

X = 3 

X = '3' 

but fails for 

X = 'INT' 

X = '3.0' 

INTEGER is typically used to check data coming from the input stream. The following 
statements reject cards which do not contain a single integer left justified on the card. 

&TRIM = 1 

LOOP CARD = INPUT :F(END) 

INTEGER(CARD) :S(PROCESS)F(REJECT) 


Since the null string is equivalent to the integer 0, a blank card passes the integer test. 
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3.3 Object Comparison Predicates 

There are several types of data predefined in the SNOBOL4 language. Programmer- 
defined data types can be added, as described in Chapter 5. Some data values, such as 
numbers, can be represented in different ways as different types of data. SNOBOL4 
includes predicates to test whether two objects are identical or different. 


3.3.1 IDENT and DIFFER 

IDENT and DIFFER are functions of two arguments which may be of any data type. For 
the function call IDENT (X,Y) to succeed or for DIFFER (X,Y) to fail, the values of the 
arguments, X and Y , must be truly identical. The value of a function argument is a 
pointer to a data object or, in the case of integers and real numbers, the value is the data 
object itself. 

Each distinct string of characters appears in storage once and only once. Execution of 

X = 'BCD’ 

Y = ’ B ’ ' CD ’ 

results in X and Y having the same value. The string BCD appears once, and both X and 
Y point to it. IDENT(X,Y) therefore succeeds. 

Pattern structures behave differently. Execution of the statements 

X = A I B 

Y = A I B 

constructs two equivalent but physically distinct pattern structures. Thus, X and Y have 
different values, since they point to different copies of the pattern structure A | B. 
IDENT(X,Y) therefore fails. 

However, if 

X = A I B 

Y = X 

then IDENT{X,Y) succeeds since X and Y point to the same data object. 

Integers and real numbers are data objects rather than pointers to data. Execution of 

X = 3 

Y = 2 + 1 

assigns 3 to both X and Y. Comparison of X and Y by IDENT(X,Y) succeeds because the 
data objects are identical. Similarly, if 

X = 3.0 

Y = 3.0 


then IDENT (X, Y) succeeds. 
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IDENT and DIFFER must be used with care when their arguments have different data 
types. If 

X = 3 

Y = ' 3' 

EQ (X, Y) succeeds as illustrated earlier because numeral strings are acceptable in 
arithmetic contexts. IDENT(X,Y) fails because the value of X is an integer, but the value 
of Y is a string. 

Similarly, for 

X = 3.0 

Y = 3 

IDENT(X,Y) fails because the values are not identical. 


3.3.2 LGT 


Lexical ordering of strings can be tested using the predicate LGT (X, Y) . LGT (X, Y) 
succeeds, returning the null string, if the value of X is lexically greater than Y. Stated 
another way, LGT(X,Y) succeeds if X follows Y alphabetically. The order of the 
characters is implementation dependent. For example, on the IBM 360 the EBCDIC 
encoding is used with the blank preceding letters and letters preceding digits. The value of 
8ALPHABET is a string of all characters in lexical order. 

Consider, as an example, the problem of alphabetizing the characters in a string. That 
is, the string LABORATORIES is to be transformed into the string AABEILOORST. The 
following program performs the transformation using an interchange sorting technique. 

6ANCHOR = 1; 6FULLSCAN = 1 

FLIP = LEN(*I) . HEAD LEN(1) $ X 
STR = 'LABORATORIES' 

OUTPUT = STR 
LIMIT = SIZE(STR) - 2 

LOOP STR FLIP = HEAD Y X 

I = LT(I,LIMIT) I + 1 

1 = 0 

LIMIT = GT(LIMIT,0) LIMIT - 1 

OUTPUT = STR 

END 

Output is 

LABORATORIES 
AABEILOORRST 

Sorting is performed in the statement labelled LOOP. The pattern FLIP matches two 
adjacent characters, assigning them as values of the variables X and Y. ♦LGT(X,Y) is 
then evaluated to test the order of the characters. If the order is correct, LGT (X, Y ) fails 


LEN(1) $ Y ♦LGTfX.Y) 

:S(LOOP) 

:S(LOOP) 
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and the characters are not interchanged. If LGT (X, Y) succeeds, pattern matching succeeds 
and the two characters are interchanged as the replacement is performed. 

The variable I is used to determine which two characters are compared by LGT (X, Y) 
and the variable LIMIT controls the number of passes made over the string STR. 


3.4 Additional Primitive Functions 
3.4.1 SIZE 

SIZE is a function that determines the length of a string. Its value is the number of 
characters in its argument. 


3.4.2 REPLACE 

One-for-one character replacement in a string may be accomplished using the function 
REPLACE . The value of REPLACE (X , Y, Z ) is the string resulting from replacement in X 
of each character appearing in Y by the corresponding character in Z. As a result of 
executing the following statements, 

BINARY = '111001' 

ONESCOMP = REPLACE(BINARY,'01','10') 

ONES COMP has the value 0001 10, obtained from BINARY by replacing all zeroes with 
ones, and ones with zeroes. The value of the variable BINARY remains unchanged. 

REPLACE normally succeeds, but it fails if 

(1) the second and third arguments have different lengths, or 

(2) the second or third argument is null. 

Multiple occurrences of characters in the third argument are valid. Thus, 

REPLACE(S,'.,j:?!', ' ' ) 

replaces all punctuation marks with blanks. 

In the case of the multiple occurrence of a character in the second argument, the 
rightmost correspondence holds. Thus, following execution of the statement 

TEXT = REPLACE('FEET','EE','AO') 

the variable TEXT has value FOOT. 

An example of the usefulness of REPLACE is the following program that converts a deck 
of cards prepared on an 026 keypunch (BCD) to a deck using 029 keypunch code 
(EBCDIC). 
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LOOP PUNCH = REPLACE(INPUT, '#aX<6 ',”='()+”) ;S(LOOP) 

END 

029 graphics are used in the example above. 


3.4.3 TRIM 

TRIM is a primitive function whose argument must be a string or an integer. The value 
of TRIM is a string which is the argument value with all trailing blanks removed. Thus, 
the statements 

TEXT = 'A PRIMITIVE FUNCTION 
SHORTTEXT = TRIM(TEXT) 

gives SHORTTEXT the value A PRIMITIVE FUNCTION. The value of TEXT is not 
changed. 

It is not necessary to use the function TRIM to delete trailing blanks from input since 
this operation is performed automatically under the control of STRIM. 


3.4.4 DUPL 

DUPL is a primitive function of two arguments, the first of which is a string and the 
second an integer. The value of DUPL is a string consisting of the first argument repeated 
the number of times specified by the second argument. 

DUPL is used in many of the examples in Chapters 1 and 2 to generate strings of blanks 
for formatting output lines. A typical example is the following one where a string is right 
justified to column 50. 

OUTPUT = DUPLC ',50 - SIZE(STR)) STR 

Another use of DUPL is to generate, with one line of program, several lines of output. 
For example, if the page width for printing is 132 characters, five lines containing periods 
in columns 10 and 50 can be generated by 

OUTPUT = DUPL(DUPL(' ’,9) DUPL(' ',39) '.' DUPLC ',82), 


3.4.5 REMDR 

REMDR is a primitive function of two integer arguments. The value of REMDR is an 
integer that is the remainder of dividing the first argument by the second. Thus, 

= REMDR(15,4) 


OUTPUT 
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prints the integer 3. The sign of REMDR is the sign of the dividend or first argument. 
Thus, 

REMDR(-15,4) 

REMDR(15,-4) 

REMDR(-15,-4) 

have the values -3,3, and -3 respectively. 


3.4.6 DATE and TIME 

DATE and TIME are primitive functions requiring no arguments. The value of DATE ( ) 
is an 8-character string of the form month/day/year. For July 11, 1970 the value of 
DATEO is 07/1 1/70. 

The value of TIME () is an integer that is the elapsed time in milliseconds from the 
beginning of program execution. Compilation time is not included. On IBM 360 
equipment, the standard interval clock is updated only sixty times a second, so timing is 
approximate at best. 


3.4.7 EVAL 

EVAL is a primitive function whose argument must be an unevaluated expression, string, 
integer, or real. If the argument is an unevaluated expression, the expression is evaluated 
to obtain the value of EVAL. If the argument is a string, the value of EVAL is the value 
of the expression represented by the string. Integer and real arguments are simply returned 
as values without modification. 

In the following example, the value of S is a string, and the value of U is an 
unevaluated expression. Both output statements print the integer 15. 

S = 'X + SIZE(X) ♦ 10' 

U = *(X + SIZE(X) ♦ 10) 

X = 5 

OUTPUT = EVAL(S) 

OUTPUT = EVAL(U) 

Any string or unevaluated expression that is a syntactically correct expression in 
SNOBOL4 may be evaluated by EVAL. A syntactic error in the argument of EVAL causes 
failure of EVAL. Thus, evaluation of E in the statements 

E = '5+6' 

SUM = EVAL(E) 


fails since blanks are required around the +. 
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3.4.8 APPLY 

APPLY is a primitive function that creates and executes a function call. 
APPLY (F,a,,...,an) calls the function F with the arguments aa „ . The value 
of APPLY is the value returned by the function it calls. The function F may be any 
function, primitive or programmer-defined. Use of APPLY on a primitive function must 
specify the correct number of arguments. Omission of trailing arguments is not permitted. 

An important use of APPLY is to call various functions, depending on the current value 
of data. Execution of the statements 

X = ’REMDR' 

Y = 6j Z = 5 

OUTPUT = APPLY(X,Y,Z) 

calls REMDR (6,5) and prints 1 . Subsequent execution of 

X = ’DUPL’ 

OUTPUT = APPLY(X,Y,Z) 

calls DUPL (6,5) and prints 66666. 


3.5 Negation (-i) and Interrogation (?) 

Two predicates, specified by the unary operators -i and ?, test the success or failure 
resulting from evaluation of expressions. The negation operator -i fails if its operand 
succeeds, and succeeds if its operand fails. A null string is returned as value on success. 
The interrogation operator ? is the converse of -i. It succeeds, returning the null value, if 
its operand succeeds. It fails if its operand fails. 

Negation may be used to complement a predicate. For example, the following program 
reads an input deck and prints those cards that contain integers. 

£TRIM = 1 

LOOP CARD = INPUT 

OUTPUT = INTEGER(CARD) CARD 

END 

Suppose the converse program, one that prints all cards that are not integers, is desired. 
No predicate is available which succeeds when its argument is not an integer. However, 
the negation operator together with the predicate INTEGER suffices. Thus, the following 
program lists all noninteger cards. 

STRIM = 1 

LOOP CARD = INPUT ;F(END) 

OUTPUT = -iINTEGER (CARD) CARD ; (LOOP) 


: F ( END) 
: (LOOP) 


END 
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Complicated Boolean functions on the states of variables can be constructed using 
predicates and negation. For example, suppose the integer N is to be incremented, 
provided at least one of the variables X , Y , or Z is null. The following statement tests the 
variables and, if the condition is satisfied, increments N. 

N = -.(DIFFER(X) DIFFER(Y) DIFFER(Z)) N + 1 

If X , Y , and Z are nonnull, the expression succeeds but the negation operation fails, and N 
is not incremented. If any variable is null, the corresponding DIFFER fails, causing the 
expression to fail. Negation succeeds and N is incremented. 

Interrogation is used primarily to convert a function that returns a nonnull value into a 
predicate that succeeds or fails, but returns a null value. Thus, in the following statement, 
N is incremented if F ( X ) succeeds, but the value of F ( X ) is not concatenated with N + 1. 

N = ?F(X) N + 1 ;S(ON)F(OUT) 


3.6 External Functions 

The primitive functions described in this chapter and elsewhere in this book are built 
into the SNOBOL4 language. There is a facility whereby a library of functions, called 
external functions, can be accessed during program execution [4]. External functions are 
written in FORTRAN or assembly language and provide a means for extending the 
SNOBOL4 language. External functions are typically used for: 

(1) Complicated numerical computations. 

(2) Specialized input and output operations. 

(3) Creating and operating on data types not provided by the SNOBOL4 language 

itself. 

Since external functions are not part of the SNOBOL4 system, the availability and 
access to various external functions varies from installation to installation. This chapter 
does not describe how to create external functions, but only how to use those that may be 
available. The external functions described in the following sections are hypothetical, and 
are not necessarily available at any installation. 


3.6.1 Loading and Calling External Functions 

To use an external function, it must be added to the SNOBOL4 system by loading it 
during program execution. This is accomplished by executing the primitive function 

LOAD(prototype,library) 
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The prototype describes the external function which is then loaded from the given 
library. The prototype of the function consists of the function name, the data type for 
each argument, and the data type of the value returned. In the prototype, the name comes 
first, followed by the argument data types within parentheses, and finally the data type of 
the returned value. For example 

LOAD('SIN(REAL)REAL','SNOLIB') 

loads an external function SIN which expects one real argument and returns a real value. 
The function is loaded from the library SNOLIB. The second argument may be omitted, 
implying a standard (public) library. 

After execution of the LOAD function, the function SIN is available for use like any 
primitive function. For example, 

Z = SIN(0.79) 

assigns 0.7104 to Z, assuming SIN expects its argument in radians. 

The data types in the load prototype must correspond to those expected by the external 
function. 

Another example of an external function is 

LOAD('LOG(REAL,INTEGER)REAL') 

where LOG is an external function in which the second (integer) argument specifies the 
base of the logorithm. 

In the call of an external function, omitted trailing arguments are taken to be null 
strings. Extra arguments are ignored. 

External functions may introduce new data types into the SNOBOL4 system. The load 
prototype may specify these new data type names. A typical example is an external 
function that creates bit strings having data type BIT. 

LOAD('BITSTRING(STRING)BIT') 

An example of use is 

X = BITSTRING('111001r) 

which converts a string of characters into a corresponding string of bits. 

In load prototypes, data type specifications unknown to SNOBOL4 system are ignored 
and left to be handled by the external functions. Such data types can be omitted from the 
prototype altogether: 

LOAD('BITSTRING(STRING)') 

is equivalent to the example above. 
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3.6.2 Unloading Functions 

External functions may be unloaded when they are no longer needed. This is 
accomplished by the primitive function UNLOAD, which unloads the named function. For 
example 

unload!'sin') 

would unload SIN described in the previous section. When an external function is 
unloaded, it becomes ‘undefined’ and the space it occupies is freed. 

UNLOAD is not restricted to external functions: Any function can be unloaded and 
consequently undefined. The space occupied is reclaimed only for external functions, 
however. 


3.7 OPSYN and Operator Definition 

It is sometimes convenient to provide synonyms for existing functions or operators. The 
primitive function OPSYN can be used for this purpose. The general format of OPSYN is 

OPSYN(new,old,n) 


3.7.1 Function Synonyms 

If n is omitted (or zero), new and old are treated as function names and the new 
function name becomes a synonym for the old function. For example, 

OPSYN('SAME','IDENT') 
makes SAME a synonym for the function IDENT. 

A call using a synonym for a primitive function must have the correct number of 
arguments. Trailing arguments may not be omitted. For example, 

SAME(X) 
is erroneous. 


3.7.2 Operator Synonyms 

If the third argument of OPSYN is 1 , the two arguments are interpreted as unary 
operators if possible. For example, 

OPSYN( 
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makes the unary operator $ a synonym for ♦. After execution of this function call, $X 
behaves like *X . 

If either argument is not one of the unary operators, it is interpreted as a function 
name. Therefore 

OPSYNl,'F',1) 

makes $ a synonym for the function P. Subsequently $X results in a call of F (X) . 
Conversely, 

OPSYN('F*,,1) 

makes the function F a synonym for the unary operator *. Subsequently P (X) behaves 
like ♦X . 

If neither argument is a unary operator, the effect of OPSYN is the same as if the third 
argument were zero. 

If the third argument is 2, the arguments are interpreted as binary operators if possible. 
The result is similar to the case for unary operators. Thus 

OPSYN( ’ + ’ , ’F\2) 

causes N -t- M to call F (N, M ) . 

Any type of function can appear as an argument to OPSYN. However, an invocation of 
a primitive function by a synonym must have the correct number of arguments. 

Operators given in the first two arguments of OPSYN must be given precisely. In 

OPSYNf’Ir,'F',1) 

the first argument is not interpreted as a unary operator even though it starts with one. If 
the binary operator of concatenation is to be OPSYN ed, it must be given as a single blank. 


Any string that cannot be interpreted as an operator is taken to be a function name 
without checking that the string meets the syntactic requirements for a function name. 
The first argument, | | , in the call of OPSYN above is an example. Such functions cannot 
be called explicitly, but may be invoked implicitly by APPLY since APPLY does not check 
the syntax of its first argument either. 


3.7.3 Summary of Operators 

In addition to the operators with defined meanings, a number of undefined operators 
are provided. The operators can be defined through the use of OPSYN. Tables 
summarizing operators follow. 



Graphic 



Definition 


negation 
interrogation 
indirect reference 


unevaluated expression 


positive 
negative 
cursor position 
none 

keyword 



Binary Operators 


Graphic 

Definition 

Associativity 

Precedence 


none 

right 

12 

? 

none 

left 

12 

$ 

immediate value assignment 

left 

11 

• 

conditional value assignment 

left 

11 

1 

exponentiation 

right 

10 

X 

none 

left 

9 

* 

multiplication 

left 

8 

/ 

division 

left 

7 


none 

left 

6 

+ 

addition 

left 

5 

- 

subtraction 

left 

5 

a 

none 

left 

4 

blank 

concatenation 

left 

3 

1 

alternation 

left 

2 

£ 

none 

left 

_1 

1 


Note that the sets of unary and binary operators are the same except for concatenation 
and the alternative graphic for exponentiation. 
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Any operator that does not have a primitive definition can be defined using OPSYN. 
Similarly the primitive definition of any operator can be changed. The precedence and 
associativity of binary operators is not changed by their redefinition. 


Exercises 

Exercise 3.1: Assume source cards to an assembler have fields beginning in columns 1, 8, 
16, and 36. Write a program to reformat these cards so that the fields begin in columns 1, 
10, 20, and 40. Pad with blanks to achieve this, and keep only the first 80 characters of 
the result. 

Exercise 3.2: Assume source cards to an assembler consist of four fields in free format 
separated by blanks. The first field always begins in column 1. Typical cards might be 

X L 1,0(1) LOAD 

Y ST 1,0(3) SAVE 

Write a program to reformat these cards with the four fields starting in columns 1, 10, 
20, and 40. Assume the fields are short enough to fit in this format. This program should 
allow the first and the last fields to be omitted, but assume that the second and third 
fields are always present. 

Exercise 3.3: Refine the program in Exercise 3.2 above to allow for the possibility of a 
quoted literal (as in SNOBOL4) in the third field, and assume such literals may contain 
imbedded blanks. Assume that quotation marks do not occur in other fields. For simplicity 
assume only one kind of quotation mark. Furthermore, comment cards, identified by an 
asterisk in column 1, should be passed through unchanged. 

Exercise 3.4: Let a sequence of integers be given left justified on data cards. Write a 
program that reads in the cards and prints the largest number. 

Exercise 3.5: Write a program that reads in an integer N from a data card, and prints the 
value of N factorial. Have the program test the data and print an appropriate comment if 
the input is not a positive integer. 

Exercise 3.6: A permutation can be thought of as a rearrangement of objects. Thus a 
permutation can be represented by a pair of strings such as ( abc, cab ) indicating that a 
is replaced by c , b is replaced by a , and c is replaced by b . 

If the permutation is applied to the first string in the pair, the index of the permutation 
is the number of such iterated applications required to obtain the original string. For 
example, the index of ( abc , cab ) is three. 

Write a program that reads in a permutation and determines its index. 

Exercise 3.7: Any string of a specified length can be reversed with a single call of the 
function REPLACE . Write a call of REPLACE that reverses any string of length 7. (The 
authors are indebted to Mr. Morris M. Siegel for calling this technique to their attention.) 



Exercises 91 


Exercise 3.8: Define a unary operator that is synonymous with the primitive function 
SIZE. 

Exercise 3.9: Define a new operator for addition which has higher precedence than 
subtraction. 




CHAPTER 4 


Programmer-Defined Functions 


4.1 Introduction 

A programmer may define his own functions. A program with programmer-defined 
functions must include: 

(1) a call to the primitive function DEFINE for each programmer-defined function, 
and 

(2) a procedure, written in SNOBOL4, for each function. 


4.2 The Primitive Function DEFINE 

Procedures are written using formal arguments, and must adhere to special conventions 
for returning. Execution of the primitive function DEFINE communicates to the 
SNOBOL4 system: 

(1) the name of the function, 

(2) a list of formal arguments used in the programmer-defined procedure, 

(3) a list of variables local to the programmer-defined procedure, and 

(4) the entry point of the procedure. 

Programmer-defined functions are defined by executing the primitive function DEFINE. 
The first argument of DEFINE is a prototype for the call of the function being defined. 
The second argument is a label specifiying the entry point to the programmer-defined 
function. For example, execution of 

define!'F(X,Y)L1,L2','FENTRY') 

defines a function F with two formal arguments, X and Y. Two local variables LI and L2 
are used in the procedure whose entry point is the statement labelled FENTRY. Extra 
commas in the lists of formal arguments and local variables are ignored. 
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Often local variables are not needed, so it is permissible to omit the list of local 
variables. An example is 

define!'G(Z)’,'GENT') 

It is also permissible to omit the second argument, in which case the entry point is 
assumed to be the same as the function name. Thus, 

define!'count!n)') 

defines the function COUNT with entry point COUNT. Functions can be defined without 
any formal arguments. For example, 

define!'mark!)') 

defines the function MARK with no formal arguments. 

The DEFINE function returns the null string as value. 


4.3 Procedures for Programmer-Defined Functions 

A procedure for a programmer-defined function is a set of SNOBOL4 statements. The 
label, given explicitly or implicitly in the arguments of the associated DEFINE function, 
specifies the statement to which control is passed when a call is made to the function. 
Thus, during execution of the statement labelled ZSET in the example below, the call to 
UNION causes control to be passed to UN. Execution of ZSET is temporarily suspended 
while the value of UNION is being computed. Once the value of UNION has been 
computed, control returns to statement ZSET where computation is resumed, using the 
value returned. 

define!'union!x,y)ch','UN') 


ZSET Z = SET1 union!SET2,SET3) SET4 


UN UNION = X 

ULOOP Y LEN!1) . CH = 

UNION BREAK!CH) 

UNION = UNION CH 

The defining statement must be executed before the call is made. The procedure is 
called and should not be flowed into. The procedure may be transferred around or placed 
out of the way of program flow. 

The statements constituting the procedure are written using the formal arguments, whose 


F!RETURN) 
S!ULOOP) 

!ULOOP) 
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values are supplied by arguments of a call. 

Variables used in procedures should be declared local if their values outside the 
procedure should not be changed by the function call. In UNION, the value of the 
variable CH changes continually during evaluation. The global value of CH might be 
altered as a result of the call if CH were not declared as a local variable. Upon entry to a 
procedure, all local variables are given null string values. 

All statement labels, including labels in procedures, are global. Transfer can be made 
from a statement in one procedure to a statement in another. 

The name of a function may be used as a variable in the procedure. The value of the 
function call is the current value of the function name when the procedure returns. Thus, 
in the example above, the value of the call UNION (SET2, SETS ) is the value of the 
variable UNION when the statement ULOOP fails, causing return to the statement ZSET. 

Return of control from a procedure to the calling statement is accomplished by 
transferring to one of the three system labels: RETURN, FRETURN, orNRETURN. 


4.3.1 RETURN 

Transfer to RETURN indicates that the function call is successful. The value of the 
function call is set to the value of the function name. Execution continues in the calling 
statement using the returned value. 


4.3.2 FRETURN 

Transfer to FRETURN indicates failure of the function call. 

An example using both RETURN and FRETURN is the function PAL , which checks its 
argument to see if it is a palindromic string. PAL compares the argument string and its 
reverse. If they are identical, PAL transfers to RETURN , indicating success. Otherwise PAL 
transfers to FRETURN , indicating failure. 

6TRIM = 1 

DEFINE('PAL(STR)CH,S1,S2') 


TEST 


PHRASE = INPUT 
PAL(PHRASE) 


:F(ENO) 

:S(GOOD)F(NOGOOD) 
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PAL SI = STR 

PALL SI LEN(1) . CH 

S2 = CH S2 

PTEST IDENT(STR,S2) 


P(PTEST) 

(PALL) 

S(RETURN)F(FRETURN) 


END 


4.3.3 NRETURN 

By transferring to the label NRETURN , a programmer-defined function may return a 
computed name rather than a value. A call to a function that returns a computed name 
may be used as the subject of an assignment statement. For example, 

F(X,Y) = X Y 

is a valid statement provided the function F returns by name using NRETURN . A further 
description of names is included in Chapter 6. 


4.4 Execution of Programmer-Defined Functions 

When a call to a programmer-defined function is made, the arguments of the call are 
evaluated first. Before execution of the procedure begins, the values of the following 
variables are saved in the following order: 

(1) the name of the function, 

(2) all formal arguments, and 

(3) all local variables. 

New values are assigned to these variables as follows: 

(1) the name of the function is assigned the null string, 

(2) the formal arguments are assigned their values, and 

(3) all local variables are assigned the null string. 

Consider the function UNION specified in the defining statement 
define!'UNI0N(X,Y)CH','UN') 

and called by UNION! SET2, SET3 ) . Values of the variables UNION, X, Y, and CH at the 
time of a call are saved. New values for these variables are assigned as if the following 
statements had been executed. 





96 Programmer-Defined Functions 


4.4.1 


UNION = 

X = SET2 
Y = SET3 
CH = 

Then control passes to the statement labeled UN. 

When return from a procedure is made using RETURN, 

(1) the value of the function call is set to the value of the function name, and 

(2) the values of all variables saved at the time of the call are restored in reverse 
order. 

When return is made using FRETURN, 

(1) the values of all variables saved at the time of the call are restored, in reverse 
order, and 

(2) the call fails. 

When return is made using NRETURN, 

(1) the function call becomes a variable whose name is taken from the value of the 
function name, and 

(2) the values of all variables saved at the time of the call are restored, in reverse 
order. 

A programmer-defined function may be called with more or fewer arguments than 
specified in the corresponding defining statement. If too few arguments are specified, the 
trailing omitted arguments are assigned null strings. If too many arguments are specified, 
the extra arguments are evaluated, but their values are ignored. 


4.4.1 Example - Union, Intersection, and Negation 

This example includes three functions that perform the union, intersection, and negation 
of sets of characters, and a short test program. Notice that the procedures follow the 
defining statements in the listing. However, by transferring around the procedures, the 
defining statements are executed one after another. The test program then makes calls to 
the procedures. 

START 

UNION define!'UNION(X,Y)CH','UN') :(INTER) 

UN UNION = X 

ULOOP Y CHAR = :F(RETURN) 

UNION CHTEST :S(ULOOP) 
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UNION = 

UNION CH 

!ULOOP) 

INTER 

define!' 

inter!x,y)ch','in') 

!neg) 

IN 

X CHAR 

= 

F!RETURN) 


Y CHTEST 

p!in) 


INTER = 

INTER CH 

!in) 

NEG 

define!' 

NEG!X)CH,HEAD','NG') 

!PATDEF) 

NG 

NEG = 

UNIVERSE 


NLOOP 

X CHAR 

= 

iF!RETURN) 


NEG CHLOC = HEAD 

!NLOOP) 

PATDEF 

CHAR = 

LEN!1) . CH 



CHTEST 

= BREAK!*CH) 



CHLOC = 

BREAK!*CH) . HEAD LEN!1) 


TEST 

UNIVERSE = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 



VOWELS 

= 'AEIOU' 



OUTPUT 

= 'VOWELS = "' VOWELS '"' 



CONS = 

NEG!VOWELS) 



OUTPUT 

= 'CONS = "' CONS '"' 



WORD = 

'COMPILER' 



OUTPUT 

= 'VOWELS IN "' WORD '" ARE; 


+ 


INTER!WORD,VOWELS) "" 



OUTPUT 

= 'CONSONANTS IN '" WORD '" ARE; 

If I 

+ 


INTER!WORD,CONS) "" 



OUTPUT 

= 'UNION OP "NIGHT" AND "HOWL" IS 

If 1 

+ 


UNION!'NIGHT','HOWL') "" 



END 

Output from the program is: 

VOWELS = "AEIOU" 

CONS = "BCDPGHJKLMNPQRSTVWXYZ" 

VOWELS IN "COMPILER" ARE; "OIE" 
CONSONANTS IN "COMPILER" ARE; "CMPLR" 
UNION OF "NIGHT" AND "HOWL" IS "NIGHTOWL" 


4.5 Redefinition of Programmer-Defined Functions 

Programmer-defined functions can be defined at any time during execution of a 
program. It is also possible, having defined a function at one point, to redefine it at a 
subsequent time. For example, a function can be initialized when it is first called, and its 
entry point redefined for subsequent calls. 

In the previous example, the pattern CHLOC is used only in the function procedure for 
NEG. The following procedure for NEG includes the assignment statement that constructs 
CHLOC. 

NG CHLOC = BREAK(*CH) . HEAD LEN(1) 

DEPINEf'NEG(X)CH,HEAD','NG2') 

NG2 NEG = UNIVERSE 
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NLOOP X CHAR = 

NEG CHLOC = HEAD 


:F(RETURN) 
: (NLOOP) 


The original DEFINE for NEG specifies the starting label NG. On the first call of NEG, 
the pattern structure for CHLOC is constructed and NEG is redefined with the starting label 
NG2 . Subsequent calls of NEG do not execute the statement that constructs CHLOC. 


4.6 Recursive Functions 

Many functions are conveniently defined recursively. For example, factorials may be 
defined as 

fact(O) = 1 

fact(n) = n*fact(n-1) for n > 0 

Using Pascal’s triangle, a recursive definition for the binomial coeflScients is easily 
deduced. 


1 


1 1 


1 2 1 

13 3 1 

1 4 6 4 1 

1 5 10 10 5 1 


binc(n,0) = 1 

binc(n,n) = 1 

binc(n,k) = binc(n-1,k-1)+binc(n-1,k) 0 < k < n 

A recursive procedure has the property that the function itself is called in the 
procedure. While convenient, recursive procedures may lead to computational inefficiencies. 
Nevertheless, recursion is frequently the most natural way of expressing a function, and 
may considerably simplify programming. 

Programmer-defined functions in SNOBOL4 may be recursive. Since values of the 
function name, arguments, and local variables are all saved when a function is called, a 
procedure can include recursive coding. 
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4.6.1 Example - Decimal to Binary Conversion 

The next program converts decimal integers to their binary representation by successive 
divisions. For example, to compute the binary representation of 57, it is repeatedly 
divided by 2 and the remainders are concatenated. 

2 ) 57 
2 ) 28 
2 ) 14 

2 )_7 

2 )_3 

2 )_ 1 

1110 0 1 remainders 

57,0 = IIIOOI 2 

The binary representation of 5 7 is the binary representation of 28 (IIIOO 2 ) followed by 
the remainder of 57/2. A recursive definition of the process is 

binary(57) = binary(28) remainder(57/2) 

where concatenation is implied. 

In SNOBOL4, the results of integer division are truncated. Thus, 57 / 2 is 2 8. Thus, 
the recursive definition can be written in the more general form 

binary(n) = binary(n/2) remdr(n,2) for n > 1 

(where remdr (n, 2) is the remainder of n/2) with the terminal cases 

binary!1) = 1 

binary(O) = 0 

A defining statement and procedure for BINARY are 

DEFINE('BINARY(N)') :(BINEND) 

BINARY BINARY = GT(N,1) BINARY(N / 2) REMDR(N, 2) 

-I- : S (RETURN) 

BINARY = N :(RETURN) 

BINEND 

On entry to BINARY, the value of N is tested by the predicate GT(N, 1) , which fails for 
the two terminal cases N = 0 and N = 1 . Thus, the first statement fails for terminal cases 
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and N is returned as the value of BINARY. If N is greater than 1 , a recursive call is made 
to BINARY with N / 2 as the argument. The value of BINARY(N / 2) then has the 
remainder of N / 2 concatenated with it, to get the final value of BINARY (N) . 

The following diagram illustrates the recursive calls made during evaluation of 
binary! 57). The recursion plunges six levels before reaching the terminal case of 
N = 1 . On returning, the value of BINARY evolves. 


binary!57) 



N = 57 BINARY = 111001 



N = 28 BINARY = 11100 



N = 14 BINARY = 1110 



N = 7 BINARY =111 
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C_) 


It is important to notice the necessity of preserving values before a function call and 
restoring them upon completion. At the first level down, BINARY(28) is called with N 
having value 57. During the course of evaluating BINARY(28) , N takes on values 28, 
14,7,3, and 1 . Following evaluation of BINARY (28) , N must once again have the value 
57 in order to compute the remainder of 5 7 / 2. 

An improvement is possible in the definition of BINARY. SNOBOL4 permits use of a 
function name as one of the formal arguments in a function definition. Thus, 

define!'binary(binary)') 

is a valid statement. The procedure of BINARY can be rewritten substituting BINARY for 
N. 


BINARY BINARY = GT(BINARY,1) BINARY(BINARY / 2) 

+ RBMDR(BINARY,2) :(RETURN) 

The second statement would become 

BINARY = BINARY 

which is redundant. For the terminal cases recognized by the failure of GT(BINARY, 1) , 
BINARY has the proper value, 0 or 1 , and an unconditional RETURN is made. 

define!'binary(binary)') 



OUTPUT 

= 

' 0 

= 

• 

binary!0) 



OUTPUT 

= 

' 13 

= 

• 

binary!13) 



OUTPUT 

= 

' 57 

= 

• 

binary!57) 



OUTPUT 

= 

' 472 

= 

1 

binary!472) 



OUTPUT 

= 

' 8192 

= 

• 

binary!8192) 



OUTPUT 

= 

' 13279 

= 

• 

binary!13279) 



OUTPUT 

= 

'99999 

= 

• 

binary!99999) 

:!END) 

BINARY 

BINARY 

= 

GT(BINARY, 

1) 

BINARY(BINARY / 

2) 

+ 



REMDR(BINARY 

,2) 

:(RETURN) 

END 
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0 

= 

0 

13 

= 

1101 

57 

= 

111001 

472 

= 

111011000 

8192 

= 

10000000000000 

13279 

= 

11001111011111 

99999 

= 

11000011010011111 


4.6.2 Example - Polish to Infix Translation 

Arithmetic expressions such as 

X + y 
A / B / C 
V-W-X + Y*Z 

are written using an infix notation. They can also be written in Polish prefix notation 
[5,6], resembling conventional functional notation. Here the binary operators appear to the 
left of their arguments. Prefix notation for the expressions is 

+(X,Y) 

/(/(A,B) ,C) 

+ (-(-(V,W),X),*(y,z)) 

Conversion from Polish prefix form to infix form, and vice versa, can be performed 
using recursive programmer-defined functions. The first of the two following programs 
converts strings from Polish to infix form. The recursive rules for specifying the function 
INF are: 

(1) If the argument to INF is a simple variable, then 

INF (VAR) = VAR 

(2) If the argument to INF is a Polish expression of the form 0P(EX1 ,EX2) , then 

INF(OP(EX1,EX2)) = (INF(EXI) OP INF(EX2)) 

The conversion consists of finding the operator and its two arguments, which may be 
expressions. The operator is inserted between its two arguments and parentheses are placed 
around the resulting expression. Of course, the arguments are still in Polish form, so each 
must be converted to infix by a recursive call of INF. 

The following diagram depicts the conversion of / (/ ( A , B ), C) to (( a/B ) /C ) . 
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INF('/(/(A,B),C)') 



/ ( /(A,B) , C ) ((A/B)/C) 



A A B B 



In the following program, the procedure for INF consists of one line. The pattern 
INPAT is used to break a Polish expression into an operator and two arguments. 



If INPAT matches INF, it matches the entire string, which is then rearranged into infix 
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notation. If INPAT fails to match, INF must 
value. 

6ANCH0R = 1 ; STRIM = 

INPAT = LEN(1) .OP 
•f RPOS ( 0 ) 

define!'INF(INF)X,Y,0P') 
OUTPUT = ' PREFIX FORM' 

OUTPUT = 

LOOP STRING = INPUT 

OUTPUT = STRING DUPL(' ' 

+ 

INF INF INPAT = '(' INF(X) 

+ 

END 

Output from the program follows. 

PREFIX FORM 

-(♦(A,+(B,C)),/(D,E)) 

(A,B),C),D),E),*(F,G)) 
-(+(A,*(B,G)),/(D,P)) 


be a variable and is returned unchanged as 

1 

BAL .X ' , ' BAL . Y ' ) ' 


DUPL(' ’,25) 'INFIX FORM' 

;F(END) 

36 - SIZE(STRING)) INF(STRING) 
:(LOOP) 

OP INF(Y) ')' 

:(RETURN) 


INFIX FORM 

( (A*(B+C))-(D/E)) 

( ((((A-B)-C)-D)-E)-(F*G)) 
( (A+(B*G))-(D/P)) 


4.6.3 Example - Infix to Polish Translation 

Conversion of arithmetic expressions from infix to Polish form is harder than the 
converse. A function POL, which performs the conversion, is of the form: 

P0L(EX1 OP EX2) = OP '(' P0L(EX1) ',' POL(EX2) ')' 

Ambiguities can arise when attempting to separate an unparenthesized expression into two 
expressions and an operator. For example, the expression 

A - B ♦ C - D 

can be separated many ways, including 
(A - B) ♦ (C - D) 

(A - (B ♦ C)) - D 

Normal conventions for the precedence and association of operators require that 
multiplication and division have precedence over addition and subtraction, and that 
operators associate to the left. Thus, of the choices above, the first is incorrect because 
subtraction is given higher precedence than multiplication, and the second is correct. 

In defining the function POL, the precedence of multiplicative over additive operators 
can be assured by dealing with the additive operators first. For example: 
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POL(’W*X+Y*Z') 



W ♦ X ♦(W,X) 



W W X X 



Left association of operators is assured by selecting the rightmost operator in a string of 
operators having equal precedence. For example 
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Thus, the rules prescribing the behavior of POL are: 

(1) Remove any enclosing parentheses from the infix string. 


(2) If possible, separate the argument into two expressions balanced with respect to 
parentheses and separated by the rightmost additive operator. The value of POL then 
becomes 

OP '(' POL(EXI) POL(EX2) ')' 

If this is not possible, perform Step 3. 

(3) If possible, separate the argument into expressions balanced with respect to 
parentheses and separated by the rightmost multiplicative operator. The value of 
POL then becomes 

OP '(' POL(EXI) POL(EX2) ')' 

If this is not possible, perform Step 4. 

(4) The infix string must be a simple variable, which becomes the value of POL. 


A complete program for infix-to-Polish conversion and test results follow. 


+ 

+ 


SANCHOR = 1 ; £TRIM = 1 

PMPAT = (ARBNOfBAL ANY('+-')) $ X PAIL 

TAB(*(SIZE(X) - 1))) . X LEN(1) 

MDPAT = (ARBNOlBAL ANY('♦/')) $ X PAIL 

TAB(*(SIZE(X) - 1))) . X LEN(1) 

STRIP = '(' BAL . POL ')' RPOS(O) 

DEPINEf'POL(POL)X,Y,OP') 

OUTPUT = ' INPIX PORM* DUPL(' ’,26) 'PREPIX PORM' 

OUTPUT = 


♦DIPPER(X) 
OP REM . Y 
♦DIPPER(X) 
OP REM . Y 


LOOP STRING = 

INPUT 


:P(END) 

OUTPUT = 

STRING 

DUPL(' ',36 - SIZE(STRING)) POL(STRING) 

+ 



:(LOOP) 

POL POL STRIP 



:S(POL) 

POL PMPAT 

= OP 

’ ( ' 

POL(X) POL(Y) ’)' 

+ 



:S(RETURN) 

POL MDPAT 

= OP 

' ( ’ 

POL(X) POL(Y) ')’ 

+ 



:(RETURN) 

END 




INPIX PORM 



PREPIX PORM 

((A*(B+C))-(D/E)) 



-(♦(A,+(B,C)),/(D,E)) 

A-B-C-D-E-P*G 



-(-(-(-(-(A,B),C),D),E),*(P,G) 

((A+(B*G))-(D/P)) 



-(+(A,*(B,G)),/(D,P)) 


The pattern STRIP removes the outer parentheses from the infix expression. The patterns 
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PMPAT and MDPAT separate the infix expression into two expressions and an operator 
according to the convention for left association. The patterns are identical except that 
PMPAT looks for addition or subtraction and MDPAT looks for multiplication or division. 

PMPAT has three parts, corresponding to the first balanced expression, the operator, and 
the second balanced expression. The pattern for the first expression is complicated by the 
fact that the operator must be the rightmost in the string of operators. Consider the 
pattern for the first expression: 

(ARBNO(BAL ANY('+-')) $ X PAIL | ♦DIPPER(X) 

+ TAB(*(SIZE(X) - 1))) . X 

It consists of two alternatives. The first, 

ARBNO(BAL ANY('+-')) $ X FAIL 

is used to locate the rightmost operator by matching a sequence of balanced strings 
followed by additive operators. FAIL forces ARBNO to match the longest such string and 
eventually causes failure of the alternative. Thus, for the expression A-B+C-D, the last 
match of the first alternative is 



On entry to the second alternative 

♦DIFFER(X) TAB(*(SIZE(X) - 1)) 

the value of X is checked to see if it is the null string. If so, no match is possible. If it is 
not null, the first balanced expression must be all but the last character of X. The first 
expression is matched by 

TAB(*(SIZE(X) - 1)) 

The remainder of PMPAT consists of the expression 
LEN(1) . OP REM . Y 

LEN ( 1 ) is used to match the operator and REM matches the remainder of the string, which 
is the second balanced expression. 
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4.6.4 Example - Tower of Hanoi 

The Tower of Hanoi is a game derived from the ancient Tower of Brahma, a ritual 
allegedly practiced by Brahman priests to predict the end of the world. At the time of 
creation, 64 golden discs of decreasing size appeared stacked on a diamond needle. Nearby 
were two other diamond needles, both empty. The Brahman priests, created at the same 
time, were set to the task of moving the discs from their original needle to a second 
needle using, when necessary, the third needle as temporary storage. Before all 64 discs are 
moved to the second needle and stacked in decreasing size, the end of the world will be 
upon us. 

Creation Intermediate End of 

Storage the World 
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Movement of the discs is governed by the rules: 

(1) only one disc may be moved at a time, 

(2) a disc may be moved from any needle to any other, and 

(3) at no time may a larger disc rest upon a smaller disc. 

A solution to the Tower of Hanoi is a recursive function that prints out the steps 
necessary to move N discs from one needle to another (where N is hopefully a good deal 
smaller than 64). A program that defines the function HANOI and tests it by moving 5 
discs from needle A to needle C follows. 

define!'HAN0I(N,NS,ND,NI)') :(HANOI.END) 

HANOI EQ(N,0) :S(RETURN) 

HANOI(N - 1,NS,NI,ND) 

OUTPUT = 'MOVE DISC ' N ' FROM ' NS ' TO ' ND 
HANOKN - 1,NI,ND,NS) ; (RETURN) 

HANOI.END 

TEST HAN0I(5,'A','C,'B') 

END 


MOVE 

DISC 

1 

FROM 

A 

TO 

C 

MOVE 

DISC 

2 

FROM 

A 

TO 

B 

MOVE 

DISC 

1 

FROM 

C 

TO 

B 

MOVE 

DISC 

3 

FROM 

A 

TO 

C 

MOVE 

DISC 

1 

FROM 

B 

TO 

A 

MOVE 

DISC 

2 

FROM 

B 

TO 

C 

MOVE 

DISC 

1 

FROM 

A 

TO 

C 

MOVE 

DISC 

4 

FROM 

A 

TO 

B 

MOVE 

DISC 

1 

FROM 

C 

TO 

B 

MOVE 

DISC 

2 

FROM 

C 

TO 

A 

MOVE 

DISC 

1 

FROM 

B 

TO 

A 

MOVE 

DISC 

3 

FROM 

C 

TO 

B 

MOVE 

DISC 

1 

FROM 

A 

TO 

C 

MOVE 

DISC 

2 

FROM 

A 

TO 

B 

MOVE 

DISC 

1 

FROM 

C 

TO 

B 

MOVE 

DISC 

5 

FROM 

A 

TO 

C 

MOVE 

DISC 

1 

FROM 

B 

TO 

A 

MOVE 

DISC 

2 

FROM 

B 

TO 

C 

MOVE 

DISC 

1 

FROM 

A 

TO 

C 

MOVE 

DISC 

3 

FROM 

B 

TO 

A 

MOVE 

DISC 

1 

FROM 

C 

TO 

B 

MOVE 

DISC 

2 

FROM 

C 

TO 

A 

MOVE 

DISC 

1 

FROM 

B 

TO 

A 

MOVE 

DISC 

4 

FROM 

B 

TO 

C 

MOVE 

DISC 

1 

FROM 

A 

TO 

C 

MOVE 

DISC 

2 

FROM 

A 

TO 

B 
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MOVE 

DISC 

1 

FROM 

C 

TO 

B 

MOVE 

DISC 

3 

FROM 

A 

TO 

C 

MOVE 

DISC 

1 

FROM 

B 

TO 

A 

MOVE 

DISC 

2 

FROM 

B 

TO 

C 

MOVE 

DISC 

1 

FROM 

A 

TO 

C 


The program logic can be seen by induction. Clearly, moving no discs requires no steps. 
Moving one disc from needle A to needle C requires one step. 

MOVE DISC 1 FROM A TO C 


Moving two discs from A to C requires three steps. 

MOVE DISC 1 FROM A TO B 

MOVE DISC 2 FROM A TO C 

MOVE DISC 1 FROM B TO C 


Moving three discs from A to C requires seven steps. 


MOVE DISC 1 
MOVE DISC 2 
MOVE DISC 1 
MOVE DISC 3 
MOVE DISC 1 
MOVE DISC 2 
MOVE DISC 1 


FROM 

A 

TO 

C 

FROM 

A 

TO 

B 

FROM 

C 

TO 

B 

FROM 

A 

TO 

C 

FROM 

B 

TO 

A 

FROM 

B 

TO 

C 

FROM 

A 

TO 

C 


The general solution is: 


MOVE N-1 DISCS FROM A TO B 
MOVE DISC N FROM A TO C 
MOVE N-1 DISCS FROM B TO C 


The implementation is simple. HANOI is defined with four arguments: 

(1) N is the number of discs to be moved, 

(2) NS is the starting needle, 

(3) ND is the destination needle, and 

(4) NI is the intermediate storage needle. 

On entry to HANOI , the value of N is compared with zero. If N is zero, no discs are 
moved and the function returns. If N is not zero, HANOI is called recursively to move N-1 
discs from the starting needle to the intermediate storage needle. Having done that, the 
command to move the Nth disc from the starting needle to the destination needle is 
printed. Finally, HANOI is called a second time to move the N-1 discs from intermediate 
storage to the destination needle. 
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Exercises 

Exercise 4.1: Write a function that prints all substrings of length N from a string S. 

Exercise 4.2: The Fibonacci numbers F(N) are defined by the equations 

F(0) = 0 
F( 1) = 1 

F(N+1) = F(N) + F(N-1) 

Define a function F such that the value of F ( N ) is the N th Fibonacci number. 

Exercise 4.3: Define a function that converts algebraic expressions into Polish notation. Let 
the definition of expression include the binary operator ! for exponentiation in addition to 
the operators and /. Assume that exponentiation associates to the right. 

Exercise 4.4: Define a function NOT such thatNOT(P) matches the null string if P fails to 
match, and fails ifP matches. 




CHAPTER 5 


Arrays, Tables, and Defined Data Types 


5.1 Arrays 

An array is an indexed aggregate of variables, called elements. Arrays are created by 
the execution of the primitive function ARRAY. ARRAY(p,e) returns an array whose 
bounds and dimensions are described by the prototype p. Every element is initialized to 
the value of the expression e . For example, 

VECTOR = ARRAY(10) 

assigns a one-dimensional array of length 10 to VECTOR. Since the second argument is 
omitted, each element of the array has the null string value. Indexing ordinarily starts at 
1 . Other lower bounds may be specified by using a colon to separate the upper and lower 
limits. 

LINE = ARRAY ('-5:5') 
creates an array with lower bound -5 and upper bound 5. 

Additional dimensions in a prototype are separated by commas. Thus, 

BOARD = ARRAY ( ' 3,3 ' , ' X' ) 
defines a three-by-three array with all elements having the value X . 


BOARD 
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There is no intrinsic limit on the size or dimensionality of an array. 

Warning: The first argument of ARRAY is the prototype, and the second is a value, 
which is given to each element of the resulting array. Thus, 

A = ARRAY('3,3') 

creates a two-dimensional array with each element having the null string as value. 



On the other hand, 

A = ARRAY(3,3) 

creates a one-dimensional array with each element having the value 3. 


A 



Each element 
instructions 


of an array is given the same initial value. Consequently, execution of the 


A1 = array!5) 

A2 = array! 5, A1) 

creates only two arrays. Each element of A2 has the same array, A1 , as value. 
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A2 A1 



5.1.1 Array References 

If the value of a variable is an array, as is the case with VECTOR, BOARD, A, A1 , and 
A2 above, an element in the array may be referenced through the variable. Angular 
brackets following the array-valued variable are used to specify the element. Array 
references such as VECT0R<8> or BOARD<2 , 3> are variables. For example, 

VECT0R<8> = EXP 

assigns the value of EXP to the eighth element of VECTOR. 

OUTPUT = B0ARD<2,3> 

prints the value of the 2,3 element of BOARD. 

FIELD = break!' ’) . LINE<-3,4> ’ ’ 

defines a pattern that breaks out a field of data and assigns it to the -3,4 element of 
LINE. 

I'ach element of an array may have any type of data object as value. For example, the 
first ^ement of an array may be an integer, the second a pattern, and so forth. 

If index referring to an element of an array falls outside the range of the array, the 
array reference fails. Thus, 

OUTPUT = VECTOR<12> 

fails. This failure may be used to control iteration through the elements of an array 
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without knowing its size, A function SUM , whose value is the sum of all the elements of an 
array, could have the defining statement 

define!'sum!arrayin’) 

with the procedure 

SUM N = N + 1 

SUM = SUM + ARRAY<N> :S!SUM)F!RETURN) 

The summation loop continues until N exceeds the range of ARRAY , This function does 
not need to know the size of ARRAY , but only that it is a one-dimensional array with a 
lower bound of one. 

Omitted trailing indices are taken to be zero. Specification of too many indices is an 
error. 


Example - Bubble Sort 


A simple application of one-dimensional arrays is illustrated in the following example 
that puts strings in lexical order. A bubble sort is much like an exchange sort. When two 
elements are found to be out of order, they are switched. However, the lexically smaller 
item is bubbled up to its proper place. 


* 




* 

READ 

* 

GO 

* 

PRINT 

* 

SORT 


SWITCH 


BUBBLE SORT PROGRAM 
STRIM = 1 

define!’sort!n)i’) 

DEFINE!'SWITCH!I)TEMP') 

define!'bubble!J)' ) 



GET NUMBER OF ITEMS TO BE 

SORTED 

N 

= INPUT 

:F!ERROR) 

A 

= ARRAY!N) 


I 

READ IN THE ITEMS 

= 1+1 


A<I> 

= INPUT 

:F!G0)S!RE. 

sort!n) 

M 

SORT THE LIST 

PRINT SORTED LIST 

= 1 


OUTPUT 

= A<M> 

:F!END) 

M 

= M + 1 

:!PRINT) 

I 

FUNCTIONS 

= lt!i,n - 1) 1+1 

:F!RETURN) 

lgt!a<i>. 

A<I + 1>) 

:P!S0RT) 

switch!!) 
bubble!I) 


:!SORT) 

TEMP 

= A<I> 


A<I> 

= A<I + 1> 
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BUBBLE 

A<I + 1> = 

J = 

TEMP 

GT(J, 1 ) 

J - 1 

:(RETURN) 
:F(RETURN) 

END 

LGT(A<J>,A<J 
SWITCH(J) 

+ 1>) 


:F(RETURN) 
:(BUBBLE) 


For the input 
15 

ADDSIB 

BUKINT 

ADJTTL 

BUCKET 

AUREAL 

BKSPCE 

APDSP 

ARRAY 

BKSIZE 

ALTERN 

BRANCH 

ADJUST 

BUFFER 

ADDSON 

ADDLG 

the output is 

ADDLG 

ADDSIB 

ADDSON 

ADJTTL 

ADJUST 

ADREAL 

ALTERN 

APDSP 

ARRAY 

BKSIZE 

BKSPCE 

BRANCH 

BUCKET 

BUFFER 

BUKINT 


One iteration of SORT is: 



118 Arrays, Tables, and Defined Data Types 


5.2 


SWITCH BUBBLE INCREMENT 

I 



Elements above I are properly ordered. If elements at I and 1+1 are out of order, they 
are switched. The new element at I, which is B, is bubbled by means of interchanges to 
its proper place above I. I is incremented and the process continues. 


5.2 Tables 

A table is an aggregate of variables, similar to a one-dimensional array. However, a 
variable in a table can be referenced by any data object, unlike arrays that require integer 
references. A table can be thought of as an associative array. 

A table is created by the TABLE function. For example, 

T = TABLE() 

creates a table and assigns it as the value of T. Variables in the table T may be 
subsequently referenced in much the same manner that array references are made. For 
example, 

T<'A’> = 3 

assigns the value 3 to the A th element of T. 

The referencing argument can be any value of any data type. By simply using an 
argument, the appropriate variable in the table is referenced. If no such variable exists, 
one is created and given the null string as its initial value. 
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The TABLE function actually has two arguments, both of which can be omitted as in 
the introductory example above. The general form of the function is 

TABLE(N,M) 

where N and M concern the size of the table. N determines the initial size of the table, 
indicating how many variables it can contain. M is the number of additional variables 
provided if more are required. For example, 

table!20,15) 

creates a tables of 20 variables. If more are required, the table size is increased to 35. If 
this is not sufficient, the size is increased to 5 0 , and so on. 

The default values for N and M are 10. If either argument is omitted (or zero), the 
corresponding default is used. Efficient use of tables is obtained by chosing values of N 
and M appropriate to the expected use of the table. 

Programmers are cautioned that each different argument references a different variable 
in a table. T< 1 > and T<' 1 ' > reference different variables. Particular care must be taken 
when the referencing argument is not given explicitly, but rather results from the 
evaluation of an expression. 

An example of the use of tables appears at the end of the next section. 


5.3 Functions for Use with Arrays and Tables 
5.3.1 COPY 

The value of the ARRAY function is an object whose data type is ARRAY. This value 
may be assigned to one or more variables. 

A = ARRAY(3) 

B = A 

A and B have the same array as value. 



B 
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Thus, 

B<2> = 'SIX’ 

OUTPUT = A<2> 

print SIX. 


A-> 


B 




^SIX 


The COPY function produces a copy of an array. Executing the statements 

A = ARRAY(3) 

A<2> = ’TWO’ 

B = COPY(A) 

B<2> = 'SIX' 


creates distinct arrays. Unlike the previous example, assigning a value to B<2> does not 
affect the value of A<2>. 


B 


TWO 


->SIX 


COPY may be used with other types of data, as described in Chapter 7. Tables cannot be 
copied, however. 


5.3.2 PROTOTYPE 

The value of the dimension or range of an array is sometimes needed. The primitive 
function PROTOTYPE returns the prototype used to define the array. PROTOTYPE has an 
array-valued argument and returns the prototype string. Thus, if 

A = array!'-5:5','X') 
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then the value of PROTOTYPE (A) is the string -5:5. 

An example utilizing PROTOTYPE is the following function named SQUARE . The 
argument of SQUARE is any singly-dimensioned array. The value of SQUARE is a 
2-dimensional square array whose dimensions equal that of the argument, and whose 
elements are null strings. 

DEFINE('SQUARE(A)') :(SQEND) 

SQUARE SQUARE = ARRAY(PROTOTYPE(A) PROTOTYPE(A)) ;(RETURN) 

SQEND 

The argument of ARRAY is a string formed from two occurences of PROTOTYPE (A) 
separated by a comma. Thus, the index range is the same for both dimensions of the new 
array. 


5.3.3 ITEM 

In order to reference an array or table element by means of angular brackets, the array 
must be the value of a known identifier. Sometimes this is not the case. For example, 

$X = ARRAY(10) 

is an acceptable assignment statement. But $X<2> and ($X)<2> do not reference the 
second element of the array. In the first expression, the unary $ operates on the value of 
X<2>; the second is syntactically erroneous. 

There are two ways to refer to an element of such an array. The array can be assigned 
to a known identifier: 

TEMP = $X 

TEMP<2> = 'SIX' 

Alternatively, the primitive function ITEM can be used. The value of 
ITEM ( A , i i „ ) is the ( i i q ) element of the array A . 

ITEM($X,2) = 'SIX’ 

assigns SIX to the second element of the array that is the value of $X. 

Similarly, if 

A<1> = array!100) 

is executed, the 50th item of this array may be referenced by ITEM(A<1>, 50) . 

ITEM also may be used to reference tables. 

If an index referring to an element of an array falls outside the range of the array, the 
call of ITEM fails. As with array and table references, extra arguments are erroneous. 



122 Arrays, Tables, and Defined Data Types 


5.3.4 


5.3.4 Conversion between Arrays and Tables 

Conversion between tables and arrays can be performed using the CONVERT function. If 
T is a table 

A = CONVERT (T, ' ARRAY ' ) 

assigns to A an array corresponding to the table T . The prototype of A is n, 2 where n is 
the number of variables in T that have nonnull values. A<I, 1> is a referencing argument 
and A<I, 2> is the value of the corresponding variable. Only variables with nonnull 
values are included. Conversion from TABLE to ARRAY fails if there is no variable with a 
nonnull value. 

Conversion of a table to an array creates a ‘fixed’ representation of the table that can 
be accessed systematically by indexing through the resulting array. This ability is 
particularly useful when the variables of the table are not known, as illustrated by the 
example at the end of this section. 

A rectangular array with a second dimension that has an extent of two can be 
converted to a table. For example, if A is an array with prototype -3:3,2, 

T = CONVERT(A,'TABLE') 

creates a table T with seven variables corresponding to A<I,1> (-3 < I < 3) and with 
values of these variables being A<I, 2> respectively. A<I,1> is created even if the value 
of A<I, 2> is the null string. If the second dimension does not have an extent of two, the 
call of CONVERT fails. 

The value used for additional variables (corresponding to the second argument of 
TABLE ) is the default, 10 . 


Example - Word Counting 

A simple application of tables is illustrated in the following example that counts the 
number of times that words occur in input text. A line of text is read in, printed, and 
decomposed into words, using a simple definition for word separation. Each word 
references a corresponding table variable used to keep the count. When the input text is 
exhausted, the table is converted into an array and the results printed. 

SANCHOR = 1 

SEPARATOR = ' 

END = BREAK(S EPARATOR) 

GAP = SPAN(SEPARATOR) 

TOKEN = END . WORD GAP 

COUNT = TABLE(20,10) 

READ LINE = INPUT :F(PRINT) 

OUTPUT = LINE 
LINE GAP 
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NEXTT 

LINE 

TOKEN 

= 

:F(READ) 

PRINT 

COUNT<WORD> 
OUTPUT = 

OUTPUT = 

OUTPUT = 

= COUNT<WORD> + 1 

'WORD COUNTS ARE:' 

:(NEXTT) 


COUNT 

I = 

1 

CONVERT(COUNT,'ARRAY') 

:F(END) 

NEXTC 

OUTPUT 

= 

COUNT<I,1> ' = ' COUNT<I,2> 

:F(END) 

END 

I = 

I + 

1 

:(NEXTC) 


For input consisting of a fragment from the poem ‘The Bells’ by Edgar Allan Poe, the 
output is 


ON THE FUTURE! - HOW IT TELLS 
OF THE RAPTURE THAT IMPELLS 
TO THE SWINGING AND THE RINGING 
OF THE BELLS, BELLS, BELLS, - 
OF THE BELLS, BELLS, BELLS, BELLS, 

BELLS, BELLS, BELLS, - 
TO THE RHYMING AND THE CHIMING OF THE BELLS! 

WORD COUNTS ARE: 

ON = 1 
THE = 9 
FUTURE = 1 
HOW = 1 
IT = 1 
TELLS = 1 
OF = 4 
RAPTURE = 1 
THAT = 1 
IMPELLS = 1 
TO = 2 
SWINGING = 1 
AND = 2 
RINGING = 1 
BELLS = 11 
RHYMING = 1 
CHIMING = 1 


5.4 Programmer-Defined Data Types 

SNOBOL4 allows the programmer to define his own types of data objects. A 
programmer-defined data object is an ordered set of variables called fields. A call of 
DATA(p) defines a new data type described by the prototype p. The prototype p is a 
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String denoting the name of the data type and the names of the fields. There is no 
intrinsic limit to the number of fields. 

As an example of a programmer-defined data type, consider complex numbers. A 
complex number can be said to consist of two fields, the real and the imaginary. The call 

DATA('C0MPLEX(R,I)') 

defines a data type COMPLEX, with two fields R and I. 

To create an object with the data type COMPLEX, a call of the form 

complex! e1,e2) 

is made, where el and e2 are expressions with numerical values. For example, to assign 
the complex number ‘1.5+2 . Oi ’ to the variable C, the statement 

C = complex!1.5,2.0) 

is executed. Each call of the function COMPLEX creates two new variables corresponding 
to the real and imaginary parts. These variables may be referenced by using the field 
name as a function. After executing the statement above, the value of C is a complex 
number; the real part is referenced by R! C) and the imaginary part by I! C) . Thus, 

A = R!C) 

assigns the value 1.5 to A . Since R ! C ) is a variable, it may be assigned a value. If 
r!C) = 3.2 

is executed, the complex number ‘3.2 + 2.0i’ is assigned toC. 

Operations on complex quantities can be defined using programmer-defined functions. A 
function to compute the sum of two complex quantities is 

define!'SUM!C1,C2)') ;!SUM.END) 

SUM SUM = COMPLEX!r!C 1) + R!C2),I!C1) + l!C2)) :!RETURN) 

SUM.END 

If C has the value ‘3.2+2 . Oi ’, execution of the statement 

c = sum!c,complex! 1.0,1.0)) 
assigns ‘4.2+3.0 i ’ to C. 


Example - Text Processing 

There is no intrinsic limit to the length of a string in SNOBOL4, but there is often a 
practical limit. For example, scanning a string for a pattern can be time consuming if the 
string is long. However, many string applications require reading in and retaining long 
passages of text. For such cases, a new data type called TEXT can be defined. 

data!'text!line,n,next)') 
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The first field is a line of text, the second field indicates the line number, and the third 
field points to the next line of text. 

A passage of text is read as follows: 

I = 1 

HEAD = TEXT(INPUT,I) :F(EMPTY) 

CURRENT = HEAD 
LOOP 1=1+1 

NEXT(CURRENT) = TEXT(INPUT,I) :F(D0NE) 

CURRENT = NEXT(CURRENT) :(LOOP) 

DONE 

The resulting data structure has the form: 


HEAD 



LINE 

N 


line 3 


3 


NEXT 
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The statement 

LINE(HEAD) 'EVERY' :S(YES)F(N0) 

examines the first line for the word EVERY. 

The following section of program prints the lines and line numbers where EVERY 
occurs. 

CURRENT = HEAD 

TEST LINE(CURRENT) 'EVERY' :F(BUMP) 

OUTPUT = N(CURRENT) ': ' LINE(CURRENT) 

BUMP CURRENT = NEXT(CURRENT) 

IDENT(CURRENT) ;P(TEST) 

The same field names may exist for several data types. Thus, 

DATA('LlST(VALUE,NEXT)') 

defines a data type LIST which can coexist with the previous definition of the data type 
TEXT. Although NEXT is a field name for both TEXT and LIST, NEXT(X) is not 
ambiguous because the data type of the argument X indicates the usage. 


5.4.1 VALUE 

VALUE is a primitive field function defined on strings and names which refers to their 
value. If 


RADIX = 'HEX' 


then 

V = VALUE('RADIX') 
assigns the string HEX to V . Similarly, 

value!'RADIX') = 'DEC' 

assigns the string DEC as the value of RADIX. 

VALUE is supplied so that a programmer may define the field VALUE on programmer- 
defined data types, and then apply VALUE to strings and names as well as the defined 
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types. This permits a uniform treatment of ‘value’ without the necessity for checking data 
type. If 

DATA('LlST(VALUE,TEXT)') 

DATA('NODE(FATHER,LSON.RSIB,VALUE)') 

are used to define the data types LIST and NODE, then VALUE can be applied to objects 
with data type LIST and NODE as well as names and strings. 


Exercises 

Exercise 5.1: One way to represent a directed graph is with an array of arrays. Suppose 
the nodes of a graph are numbered 1 through N, and node I has S ( I ) successors. The 
graph can be represented by an N*2 array. The 1, 1 element is a one-dimensional array 
of successors with S ( I ) elements. The 1, 2 element can be used for a value associated 
with the I th node. 

Write a program to read in and construct a graph. The first data card should contain 
the number of nodes. Each subsequent card should contain ‘a node number and a list of 
successors. 

Exercise 5.2: A path between two nodes can be represented by a string of node numbers 
separated by commas. Write a function PATH ( I, J ) that finds a path between the nodes 
I and J. If there is no path, the function should fail. 

Exercise 5.3: Assume the statements in an assembly language program have three fields: 
label, operation, and operand. They are in free format and the label field is optional. The 
branch statement has the form 

[label! BRANCH label 

Write a program that analyzes an assembly language source deck and produces a flow 
table that lists the labels and the numbers of the statements that branch to each label. 

Exercise 5.4: Define a function MUL that computes the product of two complex numbers. 
Redefine multiplication so that if either operand is complex the result is complex. 
Otherwise, the product should be the usual result. 




CHAPTER 6 


Keywords, Names, and Code 


6.1 Keywords 

The unary operator ampersand, fi, when applied to certain identifiers called keywords, 
provides access to information used internally by the SNOBOL4 system. Some keywords 
permit internal values to be changed. Thus 

6ANCH0R = 1 

turns on the anchored pattern matching mode. Another example of a keyword is 
6STC0UNT whose value is the number of statements that have been executed. If the 
statement 

GT(6STC0UNT,40000) :S(CLEAN.UP) 

is executed, transfer to the statement labelled CLEAN.UP is made if more than 40000 
statements have been executed. The value of fiSTCOUNT is automatically incremented as 
statements are executed. The programmer cannot assign a value to &STCOUNT, and an 
attempt to do so is an error. Such a keyword is protected. Keywords to which the 
programmer can assign values (such as £ ANCHOR) are unprotected 


6.1.1 Protected Keywords 

There are fifteen protected keywords. 

SALPHABET: The value of £ALPHABET is a string of all the characters of the machine 

on which SNOBOL4 is implemented. The characters are ordered according to their 
internal coding. 

£ABORT : The value of fiABORT is the same as the initial value of ABORT . The value of 

ABORT can be changed, but the value of fiABORT cannot. There are corresponding 
protected keywords for the other primitive patterns. 

fiARB : The value of fiARB is the same as the initial value of ARB. 

£BAL: The value of fiBAL is the same as the initial value of BAL. 

fiERRTYPE: When a program error occurs, an integer code identifying the error is 
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assigned to SERRTYPE. See the discussion of errors in Chapter 10. The initial value of 
6ERRTYPE is zero, indicating no error has occured. 

£FAIL: The value of SFAIL is the same as the initial value of FAIL. 

£ FENCE; The value of £ FENCE is the same as the initial value of FENCE. 

£FNCLEVEL: The value of £FNCLEVEL is the level of programmer-defined function calls. 

The initial value of £FNCLEVEL is zero. £FNCLEVEL is incremented by one each time a 
programmer-defined function is called, and decremented with each return. 

£LASTNO: The compiler assigns a number to each statement. These numbers are used 

primarily for diagnostic purposes. The value of £LASTNO is an integer corresponding to 
the number of the last (previous) statement executed. See also £STNO. 

£REM: The value of £REM is the same as the initial value of REM. 

£RTNTYPE; The value of £RTNTYPE is the string RETURN, FRETURN, or NRETURN, 
corresponding to the type of return made by the last programmer-defined function to 
return. The initial value of £RTNTYPE is the null string. 

£STCOUNT; The value of £STCOUNT is an integer corresponding to the number of 
statements executed. The value of £STCOUNT is incremented when execution of a 
statement begins. 

£STFCOUNT; The value of £STFCOUNT is an integer corresponding to the number of 
statements that have failed. £STFCOUNT is incremented when a statement fails. 

£STNO: The value of £STNO is an integer corresponding to the number of the statement 

currently being executed. See £LASTNO. 

£ SUCCEED: The value of £ SUCCEED is the same as the initial value of SUCCEED. 


6.1.2 Unprotected Keywords 

There are thirteen unprotected keywords to which a programmer can assign integer 
values. 

£ ABEND; If £ ABEND is nonzero, indicating a request for an abnormal ending, a system 
core dump is given following program termination. When an abnormal end is forced in 
this manner on the IBM 360, the user completion code is the value of £ABEND (modulo 
4096). 

£ ANCHOR: If £ ANCHOR is nonzero, pattern matching is performed in the anchored mode. 

See Chapter 2. The initial value of £ANCHOR is zero. 

£CODE: The value of £CODE is returned to the operating system on program completion. 

On the IBM 360, the value of £CODE is returned as the condition code for that job step. 
The initial value of £CODE is zero. 
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SDUMP: If &DUMP is nonzero at program termination, the values of natural variables and 

unprotected keywords are listed at the end of the program printout. The initial value of 
£DUMP is zero. 

SBRRLIMIT: The value of SERRLIMIT controls the handling of certain program errors. 

See the description of program errors in Chapter 10. The initial value of SERRLIMIT is 
zero. 

6FTRACE; If SFTRACE is greater than zero, diagnostic tracing information is provided 
on calls to and returns from all programmer-defined functions. See Chapter 8. The initial 
value of SFTRACE is zero. 

SFULLSCAN: If SFULLSCAN is nonzero, pattern matching is performed in the fullscan 

mode. See Chapter 2. The initial value of SFULLSCAN is zero. 

6 INPUT: If £ INPUT is nonzero, input occurs automatically for input-associated variables 
(such as INPUT). If £ INPUT is zero, input does not occur. The initial value of £ INPUT is 
one. 

£MAXLN6TH: The value of £MAXLNGTH limits the length of strings that can be formed. If 

an attempt is made to form a string with more characters than £MAXLNGTH, a program 
error occurs. The initial value of £MAXLNGTH is 5000, but this value can be changed. All 
types of string formation are governed by £MAXLNGTH : concatenation, replacement, value 
assignment as a result of pattern matching, and string input. 

£OUTPUT : If £OUTPUT is nonzero, output occurs automatically whenever a value is 

assigned /to an output-associated variable. If £ OUTPUT is zero, output does not occur. The 
initial value of £ OUTPUT is one. 

£STLIMIT: The value of STLIMIT is the limit on the number of statements that can be 

executed (see £STCOUNT). The initial value of £STLIMIT is 50000. Exceeding the limit 
on statement execution causes program termination. 

£ TRACE: Diagnostic tracing facilities are available if £ TRACE is greater than zero. See 

Chapter 8. The initial value of £ TRACE is zero. 

£TRIM: If £TRIM is nonzero, trailing blanks are trimmed on input. The initial value of 

£TRIM is zero. 


6.2 Names 

A variable can be assigned a value during an assignment statement or by pattern 
matching through use of the unary cursor position operator a or the binary value 
assignment operators . and $. In SNOBOL4, variables fall into two major classes, natural 
variables and created variables. 

A natural variable is any variable whose name is a nonnull string. Thus, 
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A 

$ 'AB' 

are examples of natural variables, whose names, respectively, are the strings 

A 

AB 

,.,i 

The variable , , , ( cannot appear explicitly in an assignment statement such as 

( = 'X' 

because it is syntactically incorrect. However, 

= ’X’ 

is syntactically correct and performs the desired assignment. Every string except the null 
string is the name of a natural variable. Natural variables are available at the start of a 
program without any conscious act of creation on the part of the programmer. All natural 
variables with the exception of ABORT, ARB, BAL, PAIL, FENCE, REM, and SUCCEED 
have the null string as their initial value. 

Created variables are generated during execution of a program when, for example, an 
array is created. The statement 

A = ARRAY(10) 

creates an array of ten variables. These variables are referenced by A<1>, A<2>, ..., 
A<10>. 

Some expressions yield variables when evaluated. Such variables are called generated 
variables, and values can be assigned to them in the same manner that values can be 
assigned to variables that appear explicitly. In the statements 

M = 2 

$('N' M) == 'INVOICE' 

the subject $( 'N' M) generates the variable N2 which is assigned the value INVOICE. 
Array and table references, field functions on programmer-defined data objects, and 
programmer-defined functions that return by NRETURN are examples of expressions that 
generate variables. 

Other expressions, for example arithmetic operations, yield values but not variables. 
Thus, execution of the statement 

(A + B) = 2 


is an error. 

Gotos require natural variables. These natural variables may also be generated. The 
indirect goto 


:S($TRIM(INPUT)) 
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is an example. 

Some expressions, such as indirect references, always yield variables. Others, such as 
literals, always yield only values. Some expressions may or may not yield variables. For 
example, 

F(X) = 2 

may or may not be erroneous, depending on the function P. To allow for such cases, the 
syntax of SNOBOL4 permits any kind of expression as the subject of assignment. 
Statements such as 

2 = 3 

are syntactically acceptable even though they are erroneous if executed. 


6.2.1 Passing Names 

Consider a function BUMP which increments the value of any variable by 1 . If the 
value of variable N is to be incremented, the call 

BUMP(N) 

is not suitable because the value of N, not the name N, is passed to the procedure for 
BUMP . The form of the call must be 

BUMP('N') 

which passes the string N to the BUMP procedure. Since the string N is the name of the 
variable N, indirect referencing may be used to increment the value. 

The defining statement and procedure for BUMP are: 

define!'bump(VAR)') 


BUMP $VAR = $VAR + 1 :(RETURN) 


6.2.2 The Unary Name Operator 

Suppose BUMP is to increment the value of a created variable, such as the second 
element of the array A. The call 

BUMP(A<2>) 

is not suitable, since only the value of A<2> is passed. The call 


BUMP('A<2>') 



6.2.3 


Returning a Variable 133 


is not suitable either, since the string A<2> is passed, and 
$'A<2>’ 

is a natural variable that bears no relation to the array element. The difficulty arises 
because there is no explicit name for created variables. However, implicit names for 
created variables can be obtained through use of a unary name operator. 

The unary name operator . applied to any variable returns as its value the name of 
that variable. Thus, the value of 

.A<2> 

is the created name of the second array element. The call 
BUMP(.A<2>) 

passes the name of the second array element to BUMP, so that incrementing is done 
properly. 

The name operator serves much the same purpose for created variables as quotation 
marks do for natural variables. Furthermore, the name operator applied to a natural 
variable behaves the same as quotation marks. Thus, the value of 

.LINE 

is the string LINE. Both of the following pairs of statements assign the value 2 to MAY. 

WORD = 'MAY' 

IWORD = 2 

WORD = .MAY 
$WORD = 2 

If the argument of the name operator is a natural variable, the value returned by the 
name operator is a string which is an explicit name. If the argument of the name operator 
is a created variable, the value returned is an created name. If the argument is not a 
variable, an error occurs. For example, 

.SIZECX) 

. (A -f B) 

.+A 

are erroneous because the arguments are not variables. If A and B are integers or numeral 
strings, 

.$(A -I- B) 

is valid because $ (A -l- B) is a natural variable. 
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6.2.3 Returning a Variable 

When returning from a programmer-defined function via RETURN, the value of the 
function name becomes the value of the function call. If NRETURN is used, the value of 
the function name is returned as a variable, not as a value. Such a function call may thus 
be used freely in any context that requires a variable. 

Consider, for example, the function NEXT which returns the first unused element of an 
array. The array is given as an argument and is assumed to have a zeroth element which 
indicates the last used element. 

define!'next(a)') 

* 

NEXT A<0> = A<0> + 1 

NEXT = .A<A<0» 

* 

NEXT.END 

Thus, executing the four statements 

B = array!'0;100') 
next!b) = 'A' 

NEXT!B) = 'THE' 

'STILL' 'T' REM . NEXT!B) 

assigns to B<0> through B<3> values 3, A, THE, and ILL, respectively. 

When NEXT returns, the value of NEXT is .B<B<0», which is the name of the first 
available array element. NEXT!b) becomes the variable B<B<0». 


: !NEXT.END) 

: S!NRETURN)F!FRETURN) 


6.3 Gotos, Labels, and Code 

Flow of control is governed by unconditional, success, and failure gotos. In the goto 
field, variables indicate the next statement to which control is passed based on the 
outcome of the current statement. 

If a variable is used as a statement label, a label attribute pointing to the statement is 
assigned to the variable. This label attribute is independent of the value of the variable. 
Thus, a variable can be used in the label field and the goto field, as well as in the subject 
field of a single statement. The statement 

DELAY DELAY = LT!DELAY,N) DELAY + 1 :S!DELAY)F!ONWARD) 

is acceptable and unambiguous. 

If a variable has no label attribute, its use in a goto field is erroneous. 

It is possible, as illustrated in the next section, to change the label attribute of a 
variable. In this way, a particular label variable, such as that appearing in 
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:S(LOOP) 

may cause transfer to one statement at the beginning of execution and an entirely different 
statement later on. 


6.3.1 Creation and Execution of Code 

In the first phase of a SNOBOL4 run (compilation), the source program is converted 
into Polish-prefix object code. In the second (execution) phase this object code is 
interpreted. Object code is a type of data just as are strings, patterns, and arrays. During 
the execution phase, it is possible, using the primitive function CODE (string) , to convert 
a string of characters into object code. The argument to CODE is a string representing one 
or more SNOBOL4 statements. The value of a call to CODE is executable object code. 

A string to be compiled into object code consists of SNOBOL4 statements separated by 
semicolons. For example, if the variable GET has a string value assigned by 



GET = ' 

STRIM 

= 1 ? ’ 



+ 

1 

N = 

10; ' 



+ 

1 

LINE 

= . * 
t 



+ 

'LOOP 

N = 

GT(N,0) 

N - 1 

:P(OUT);' 

+ 

1 

LINE 

= LINE 

INPUT 

;(LOOP)' 


then 

NUCODE = CODE(GET) 

causes the statements in the value of GET to be compiled. The value of CODE (GET) 
becomes the value of NUCODE. 

Execution of statements in the value of NUCODE can be accomplished in two ways: 

(1) transfer to a labelled statement appearing in NUCODE, and 

(2) execution of a direct goto which passes control to the first statement in NUCODE, 
whether labelled or not. 

Thus, execution of the goto 

;(LOOP) 

causes transfer to the statement labelled LOOP inside of NUCODE , even if the original 
program had a statement labelled LOOP. 

Blanks are as important in strings to be converted to code as they are in the program 
itself. A statement without a label must begin with a blank. 

A direct goto is a special construction in the goto field, permitting transfer directly to 
the beginning of a block of object code rather than through a label. The direct goto uses 
enclosing angular brackets rather than parentheses. The expression enclosed in the angular 
brackets must be code-valued. Execution of the direct goto 
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;<NUCODE> 

causes transfer to the first statement 
6TRIM = 1 

Flowing off the end of a block of compiled object code results in normal termination, just 
as if there were an end statement. If the label END is encountered in the string being 
compiled, compilation ceases at that point. 

The following statement illustrates the use of the function CODE in the goto itself. 

:<CODE(' OUTPUT = "RECOMPILED" :(RESTART)')> 

The angular brackets indicate transfer to the beginning of the newly compiled block of 
CODE, which prints RECOMPILED and transfers to the statement labelled RESTART. 

The primitive function CODE fails if its argument has a syntactic error. 

It is a compilation error for the same label to appear more than once in the source 
program. Statements compiled using CODE , however, may have the same labels as 
statements compiled earlier. The label attribute for the corresponding variable references 
the new statement. For example, the following program segment is used to call a function 
PROCESS(N) with various values ofN. 

BEGIN N = 5 

LOOP N = LT(N,10) N + 1 

PROCESS(N) 

OUT 

NEWLOOP = 'LOOP N = GT(N,0) 

CODE(NEWLOOP) 


END 

Within the two-statement loop, PROCESS(N) is called with N having values 6, 7, 8, 9, 
1 0 before control passes to the statement labelled OUT . At that point, a new block of code 
is compiled consisting of the statement 

LOOP N = GT(N,0) N - 1 ;P(END) 

It is intended that PROCESS (N) be called for N with values 4,3,2, 1 , and 0 , but this is 
not the case. The original statement labelled LOOP is still in the program. It is not 
overwritten by the compilation. The label attribute of LOOP no longer points to it. The 
label attribute now points at the newly compiled statement. The new compilation is a 
second program which can freely communicate with the original. Execution of the 
program proceeds as if the following programs were compiled. 

BEGIN N = 5 

N = LT(N,10) N + 1 ;F(OUT) 

PROCESS(N) :(LOOP) 

OUT 


:F(OUT) 

;(LOOP) 

N - 1 ;F(END)' 

:(BEGIN) 
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NEWLOOP = 'LOOP N = GT(N,0) N - 1 ;F(END)' 

CODE(NEWLOOP) :(BEGIN) 


END 

LOOP N = GT(N,0) N - 1 ;F(END) 

END 


After compilation of NEWLOOP , transfer to BEGIN causes N to be assigned the value 5. 
Control flows into the statement originally labelled LOOP , which increments N to 6. 
PROCESS(N) is called and, on completion, control passes to the new statement labelled 
LOOP. N is decremented to 5, but PROCESS cannot be called as intended, since the new 
statement does not overwrite the old, and no way is provided for getting back to the 
original program. 


The program segment can be rewritten to perform as intended by using explicit gotos to 
control program flow rather than relying on the sequence of statements to control flow. 


BEGIN 

LOOP 

PROC 

OUT 


N = 5 

N = LT(N,10) N + 

PROCESS(N) 


(LOOP) 
F(OUT) 
(LOOP) 


NEWLOOP = ’LOOP N 

CODE(NEWLOOP) 


GT(N,0) N - 1 ;F(END)S(PROC)' 

:(BEGIN) 


END 


Following compilation of NEWLOOP , execution proceeds as if the following programs 
were compiled. 


BEGIN 

PROC 

OUT 


N = 5 

N = LT(N,10) N + 

PROCESS(N) 


(LOOP) 
F(OUT) 
(LOOP) 


NEWLOOP = 'LOOP N 

CODE(NEWLOOP) 


GT(N,0) N - 1 ;F(END)S(PROC)’ 

:(BEGIN) 


END 

LOOP N = GT(N,0) N - 1 :F(END)S(PROC) 

END 
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After assigning 5 toN, control passes from the statement labelled BEGIN to the new 
statement labelled LOOP. N is properly decremented to 4 and control passes to the 
statement labelled PROC which calls PROCESS. The loop continues until N is 0 . 


Exercises 

Exercise 6.1: A pushdown list can be implemented by means of an array. Let the first 
element in the array be the index of the top of the list. Define three unary operators that 
push onto the list and return a variable, pop from the list and return a value, and return 
a variable at the top of the list. 

Exercise 6.2: Write a program that reads in another SNOBOL4 program, forms it into a 
string of statements separated by semicolons, converts it to code, and executes it. Assume 
the program to be read in is terminated by a card containing EOF , rather than an actual 
end of file. Test this program by using a copy of the program as input data. 
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Most programs do not require any particular attention to types of data. Any variable 
can have any type of data as value, and most conversions between data types are 
performed automatically, without the need for any action on the part of the programmer. 

In some situations, however, it is necessary to give more attention to data types. Explicit 
conversion from one data type to another may be necessary. Sometimes the details of the 
automatic, implicit conversions must be understood. This chapter covers these matters in 
some detail. 


7.1 Data Type Representations 

Data objects are classified by type. The string used to refer to the data type is called 
the formal identification of the data type. The types of data are: 


Data Type 


Formal Identification 


string 

integer 

real number 

pattern structure 

array 

table 

created name 
unevaluated expression 
object code 
programmer-defined 
external 


STRING 

INTEGER 

REAL 

PATTERN 

ARRAY 

Table 

NAME 

EXPRESSION 

CODE 

Data type name (e.g. COMPLEX) 
EXTERNAL 


For notational convenience, the formal identification is sometimes abbreviated to the first 
letter, with D standing for defined data types and X standing for EXTERNAL. 

The function DATATYPE can be used to obtain the formal identification of the data 
type of an object. The call DATATYPE (object ) returns the formal identification. For 
example. 
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DATATYPE(LEN(1)) 

returns the value PATTERN . Similarly, 

datatype!37) 
datatype!.A<I>) 

DATATYPE!*P) 


return 


INTEGER 

NAME 

EXPRESSION 

respectively. 

The DATA function can be used to define a data type with the same formal 
identification as another data type. For example, 

data!'string!first,last)') 

defines a new data type with formal identification STRING. The two data types STRING 
are correctly distinguished by the SNOBOL4 system, but they cannot be differentiated 
during program execution by their formal identifications. 


7.2 Explicit Conversion of Data Types 

In some cases, it is reasonable to convert an object of one data type into a 
corresponding object of another data type. This can be accomplished by using the 
CONVERT function, introduced earlier in connection with tables and arrays. 


7.2.1 CONVERT 

A call of the CONVERT function has the form CONVERT! object .datatype) where 
the object is to be converted to the given data type, expressed as a formal identification. 
For example, the real number 2.5 can be converted into an integer by the statement 

I = CONVERT!2.5,'INTEGER’) 

In REAL -to-INTEGER conversion, the fractional part is discarded, and I has the value 2 . 

Any data object can be converted to a string by CONVERT . In most cases conversion to 
STRING simply returns the formal identification of the data type of the object to be 
converted, and in this way behaves like DATATYPE. In four cases, INTEGER, REAL, 
ARRAY , and TABLE , special conversions are performed to provide more meaningful 
representations of the object converted. In general, the result of converting any data object 
to STRING is referred to as the string representation of that object. 
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CONVERT 
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In conversion from INTEGER to STRING, leading zeroes are suppressed and a sign is 
provided only if the integer is negative. Thus the value of 

convert!+00732,'STRING') 
is the string 732. 

In conversion from REAL to STRING, at least one digit is provided before the decimal 
point. A sign is provided only if the real number is negative. 

In conversion from ARRAY to STRING, the formal identification ARRAY is given, 
followed by the array prototype enclosed in parentheses. For example, if an array A is 
created by the statement 

A = ARRAY('0;10' , 3) 
the value of 

CONVERT(A,'STRING') 
is ARRAY ('0:10'). 

If the prototype is longer than 20 characters, only the formal identification, ARRAY, is 
given. 

In conversion from TABLE to STRING the formal identification, TABLE, is given, 
followed by the current size of the table and its secondary extent enclosed in parentheses. 
For a typical table, T , the value of 

CONVERT(T,'STRING') 
might be TABLE(300,10) . 

The following table summarizes the data type conversions that can be performed using 
CONVERT. 


Data Type of Argument 

Data Type of Object Returned 

S IRPATNE 

C D X 

STRING 

+ 

F 

F 


F 

F 

INTEGER 

S 

* 

C 




REAL 

S 

F 





PATTERN 

I 



* 



ARRAY 

s 



* 

F 


TABLE 

s 



F 

+ 


NAME 

I 




* 


EXPRESSION 

I 




* 


CODE 

I 






defined type 

I 





(♦) 

external type 

I 





(♦) 
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In this table, the symbols have the following meaning: 

(1) ♦ indicates that no conversion is performed, and the first argument is simply 
returned as value. In the case of defined and external data types, the enclosing 
parentheses indicate that conversion is successful only in converting an object to its 
own data type. An object of one defined or external type cannot be converted to a 
different type. 

(2) P indicates the conversion may fail. For example, the string A cannot be 
converted to an integer. 

(3) I indicates the formal identification is returned. 

(4) C indicates the conversion always succeeds. 

(5) S indicates the string representation is returned. 

If an attempt is made to perform a conversion that is not available, the CONVERT 
function fails. 


7.3 Data Types of Functions and Operations 

In many contexts, conversion from one type of data to another is performed implicitly 
as required by the functions and operations involved. A typical example is the conversion 
of strings to integers in arithmetic contexts. Another example is the conversion of any data 
type to its string representation on output. 

Unlike many programming languages, SNOBOL4 permits any variable to have any type 
of data as value (keywords are the only exception). Furthermore, any expression, however 
complicated, can appear wherever a value is required. It is the result of evaluating an 
expression that is important. For example, the two statements 

X = SIZE('ABCDEP') 

X = SIZEf'A' 'BCD' 'EP') 

produce the same effect since in both cases the result of evaluating the argument of SIZE 
is the string ABCDEP . The way the string is created is irrelevant, but it is important that 
the argument be a string. Throughout the language, values may be constructed in any 
way, but individual functions and operations often require their arguments to have certain 
data tyf)es. 
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To simplify programming, automatic data type conversion is performed in many 
circumstances. Consider, for example, the statement 

Y = SIZE(1376) 

Here the argument of SIZE is an integer. This integer is automatically converted to a 
string, however, and the value 4 is assigned to Y . 

In most cases, the problem of data type conversion can be ignored since conversion is 
automatically provided in the most common situations. Care must be observed, however, 
in the use of I DENT and DIFFER where no conversions are performed. There are 
occassional hazards of conversion. For example, the value of 

SIZE(+00732) 

is 3 because of the normalization performed during conversion, as described in a 
preceding section. 

Specific information concerning expected data types and automatic conversions is given 
in the following sections. 


7.3.1 Primitive Functions 

The following table lists all primitive functions, including some described in later 
chapters. This table indicates the data types expected for each argument and the data 
type of the value returned. The following notation is used. 

(1) Upper case letter abbreviations are used for formal identification of data types. 
For example, S stands for STRING. 

(2) The letter x indicates that any data type is possible. 

(3) The letter y indicates several data types are possible. Refer to the description of 
the individual function for details. 

(4) The letter n indicates a number, either integer or real, is possible. 

(5) The symbols *8 indicate the null string. 

(6) The letter v indicates a variable. Where a variable results from return by name, 
the resultant data type is indicated. For example x-^v indicates a variable whose 
value may be of any data type. 
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Function Call 

Return 

Function Call 

Return 

ANY ( S or E) 

P 

input! S orN,I,l) 

•S 

APPLy(S,x,...,x) 

X 

integer! x) 

•S 

ARBNO(P) 

P 

item!a,i,...,I) 

x-^v 

ARG(S,I) 

S 

item!t,x) 

x-^v 

ARRAY(S, X ) 

A 

LE!n,n) 

•s 

BACKSPACE(I) 

•S 

LEN!i or E) 

p 

BREAK (S or E) 

P 

LGT ! S,S ) 

•s 

clear! x) 

• S 

LOCAL ! S,I ) 

s 

CODE ( S ) 

c 

LT!n,n) 

•s 

collect!I) 

I 

NE ! n,n ) 

•s 

CONVERT !X ,S ) 

X 

N0TANY!S or E) 

p 

COPY!y) 

y 

opsyn!s,s,i) 

•s 

data!s) 

•s 

output!S orN,I,S) 

•s 

DATATYPE ! x ) 

s 

POS!l or E) 

p 

DATE !X) 

s 

prototype!A) 

s 

DEFINE!S,S) 

•s 

REMDR ! I,I ) 

I 

DETACH ! S or N ) 

•s 

REPLACE!S,S,S) 

s 

dipper! x,x) 

•s 

rewind!I) 

•s 

DUMP!I) 

•s 

RP0S!I or E) 

p 

dupl!s,i) 

s 

RTAB!I or E) 

p 

ENDFILE!!) 

•s 

size!s) 

I 

EQ!n,n) 

•s 

SPAN!S orE) 

p 

EVAL!S or E) 

X 

ST0PTR!S orN,S) 

•S 

EVAL!I) 

I 

tab! I or E) 

P 

EVAL!R) 

R 

TABLE(1,1) 

T 

field!s,i) 

S 

TIME!x) 

I 

GE!n,n) 

• S 

TRACE! S or N, S , S , S ) 

•s 

GT!n,n) 

• S 

trim!s) 

s 

ident!x,x) 

• S 

VALUE!S or N) 

X-V 


7.3.2 Unary Operators 

A unary operator is essentially the same as a function with one argument, but with a 
more compact syntactic representation. A table of expected data types and values returned 
for the defined unary operators follows. The previous notation is used with two additions: 

(1) s’ indicates strings that are protected keywords. 

(2) S^ indicates strings that are unprotected keywords. 
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7.3.3 Binary Operators 

A binary operator is essentially the same as a function of two arguments. A table of 
expected data types and values returned for the defined binary operators follows. 


Call 


Return 


(P $ v) P 

(P . v) P 

(n ♦♦ n) n 

(n ♦ n) n 

(n / n) n 

(n -i- n) n 

(n - n) n 

ix •S) X 

(•S x) X 

(S S) S 

(P P) P 

(P I P) P 


If either argument of concatenation is the null string, the other argument is returned as 
value. Automatic conversion of other data types is described in following sections. 
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7.3.4 Statement Components 

In addition to functions and operators, there are three other constructions in which data 
types are important: 

(1) assignment, 

(2) pattern matching, and 

(3) pattern matching with replacement. 

For assignment, the expected data types are simple: 

subject = object 

V = X 

6S^ = I 

Thus the only restriction on the object of assignment occurs when assignment is made to 
an unprotected keyword. The same restriction applies to assignment performed as a result 
of pattern matching in which components contain conditional or immediate value 
assignment operators. 

In pattern matching, the situation is even simpler. The expected data types are: 

subject pattern 

S S 

S P 


Pattern matching with replacement can be considered in three steps: 

(1) Pattern matching determines the string matched, and hence the initial and 
terminal unmatched parts of the subject string. 

(2) Concatenation of the initial part, the replacement object, and the terminal part 
forms a new object. 

(3) This new object is assigned to the subject variable. 

Thus, the many possible cases of replacement can be analyzed using the preceding 
information. A typical form of replacement is 

subject pattern = object 

S-V s = s 

s*-v P = s 

where the resulting data type of the subject is a string. A number of special cases can be 
determined also. For example, if the pattern matches the entire string, the initial and 
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terminal parts are null strings, and as described in the discussion of concatenation, the 
replacement object may have any data type. Similarly, if the subject is an unprotected 
keyword, the resulting replacement must be an integer. 


7.4 Implicit Conversion of Data Types 

The previous sections detail the data types expected by the various functions and 
operations. As indicated earlier, automatic conversion is provided in many contexts. The 
following sections describe such conversion in detail. 


7.4.1 Conversion to STRING 

Conversion to STRING is one of the most frequent of all implicit conversions. The 
string representation, which includes the conversion of all data types, is provided in the 
following situations; 

(1) On output. 

(2) Where required in trace printout (see Chapter 8). 

(3) In dumps of variables, both from DUMP(N) (see Chapter 8) and after program 
termination (see Chapter 10). 

INTEGER-to-STRING conversion is provided in the following situations; 

(1) Where STRING is an expected data type. 

(2) Where STRING is specified for the argument of an external function. 

REAL-to-STRING conversion is performed in a few contexts; 

(1) Where STRING is expected in concatenation, including concatenation performed 
in replacement statements. 

(2) Where STRING is expected as the subject of pattern matching. 

(3) Where STRING is specified for the argument of an external function. 
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7.4.2 Conversion to PATTERN 

Although there is no explicit conversion to PATTERN , implicit conversion occurs in 
many circumstances. Conversion from INTEGER, REAL, STRING, and EXPRESSION to 
PATTERN is provided wherever PATTERN is specified as an expected data type in the 
preceding sections. A typical example of conversion to PATTERN is the concatenation of a 
string with a pattern. 


7.4.3 Conversion to INTEGER 

STRING-to-INTEGER conversion is frequently required. It is provided in the following 
situations: 

(1) Where INTEGER is an expected data type. This includes arithmetic contexts that 
accept reals as well as integers. 

(2) Where INTEGER is specified for the argument of an external function. This 
conversion includes conversion of numeral string reals in addition to the regular 
conversion. 

REAL-to-INTEGER conversion is provided only for arguments of external functions. 


7.4.4 Conversion to REAL 

INTEGER-to41EAL and STRING-to-REAL conversion is provided in the following 
situations: 

(1) Where REAL is an expected data type. This includes arithmetic contexts that 
accept integers as well as real numbers. Both numeral string integers and numeral 
string reals are converted. 

(2) Where REAL is specified for an argument of an external function. 
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Tracing 


Tracing facilities are provided to permit the programmer to get diagnostic information 
about the execution of his program without interfering with its logic or structure. The 
tracing mode is turned on by assigning a positive integer to the keyword £ TRACE . When 
this mode is in effect, certain types of program actions can be sensed, causing 
corresponding messages to be printed. The types of actions sensed are; 

(1) change in the value of a variable, 

(2) call of a defined function, 

(3) return from a defined function, 

(4) transfer to a label, and 

(5) change in the value of certain keywords. 


8.1 Standard Trace Procedures 

The TRACE function is used to make specific trace requests. 

TRACE (name , type , tag) 

associates the name with the type of action for tracing purposes. The tag provides 
identifying information which is included in the trace printout if the name is not a 
natural variable. If the name is a natural variable, the tag is ignored. One trace 
association must be made for each name and type desired. 

Trace printout includes the statement number in which the action occurs, the result of 
the action, and the time of the action in milliseconds measured from the beginning of 
program execution. 

If £ TRACE is not positive, there is no tracing, even though trace requests have been 
made. The value of £ TRACE is decremented by one every time an action is traced, and 
tracing is automatically turned off when the value of £ TRACE reaches zero. Therefore the 
value assigned to £ TRACE may be chosen to limit the amount of trace printout. 
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8.1.1 Value Tracing 

TRACE (name,' VALUE ',tag) 

causes trace printout whenever the value of name is changed. Consider the following 
program. 

* 

* PRINT PERMUTATIONS OF SIZE N OF A GIVEN STRING 

* 

6TRIM = 1; £TRACE = 100 1 

TRACE(’CH',’VALUE'); TRACE(’STRING’,’VALUE’) 3 



DEFINE('PERM(STRING,N,HEAD)CH,USED') 


5 


PERMREQ = break!',') . STRING LEN(1) REM 

. N 

6 

READ 

CARD = INPUT 

; F!END) 

7 


CARD PERMREQ 

:F!ERROR) 

8 


OUTPUT = 'THE PERMUTATIONS OF ' STRING ' 

TAKEN ' N 

9 

+ 

' AT A TIME ARE:' 


9 


perm!STRING,N) 

:!READ) 

10 

PERM 

OUTPUT = EQ!N,0) head 

:S!RETURN) 

1 1 

PERMA 

STRING LEN!1) . CH = 

;F!RETURN) 

12 


USED = 


13 

+ 

PERM!STRING USED,N - 1,HEAD CH) USED CH 

:!PERMA) 

13 

ERROR 

OUTPUT = '♦♦♦ ERROR IN ' CARD 

:!READ) 

14 

END 



15 


The printed output for the input ABCD, 2 is 


STATEMENT 8: STRING = 'ABCD',TIME = 17 
THE PERMUTATIONS OP ABCD TAKEN 2 AT A TIME ARE; 


AB 


AC 


AD 


BC 


BD 


STATEMENT 

12: 

CH = 'A',TIME = 17 


STATEMENT 

12: 

STRING = 'BCD',TIME 

= 1 

STATEMENT 

12: 

CH = 'B',TIME = 34 


STATEMENT 

12: 

STRING = 'CD',TIME 

= 34 

STATEMENT 

12: 

CH = 'C,TIME = 50 


STATEMENT 

12: 

STRING = 'D',TIME = 

50 

STATEMENT 

12: 

CH = 'D',TIME = 50 


STATEMENT 

12: 

STRING = '',TIME = 

67 

STATEMENT 

12; 

CH = 'B',TIME = 67 


STATEMENT 

12: 

STRING = 'CD',TIME 

= 67 

STATEMENT 

12: 

CH = 'C,TIME = 84 


STATEMENT 

12: 

STRING = 'DA',TIME 

= 84 

STATEMENT 

12: 

CH = 'D',TIME = 100 


STATEMENT 

12: 

STRING = 'A',TIME = 

100 
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BA 


CD 


CA 


CB 


DA 


DB 


DC 


STATEMENT 

12: 

CH = 'A' 

.TIME = 100 


STATEMENT 

12: 

STRING = 

''.TIME = 

117 

STATEMENT 

12: 

CH = 'C 

.TIME = 117 


STATEMENT 

12: 

STRING = 

'D'.TIME = 

117 

STATEMENT 

12: 

CH = 'D' 

.TIME = 134 


STATEMENT 

12: 

STRING = 

'AB'.TIME 

= 134 

STATEMENT 

12: 

CH = 'A' 

.TIME = 150 


STATEMENT 

12: 

STRING = 

'B'.TIME = 

150 

STATEMENT 

12: 

CH = 'B' 

.TIME = 150 


STATEMENT 

12: 

STRING = 

''.TIME = 

167 

STATEMENT 

12: 

CH = 'D' 

.TIME = 167 


STATEMENT 

12: 

STRING = 

''.TIME = 

183 

STATEMENT 

12: 

CH = 'A' 

.TIME = 183 


STATEMENT 

12: 

STRING = 

'BC'.TIME 

= 183 

STATEMENT 

12: 

CH = 'B' 

.TIME = 200 


STATEMENT 

12: 

STRING = 

'C .TIME = 

200 

STATEMENT 

12: 

CH = 'C 

.TIME = 200 


STATEMENT 

12: 

STRING = 

''.TIME = 

217 


Value tracing is the default type of tracing, and value tracing is assumed if the second 
argument to TRACE is omitted. Thus TRACE ( ' CH ' ) and TRACE ( ' STRING ' ) would be 
sufficient in the example above. Value tracing occurs whenever a value is assigned to a 
traced variable by an assignment statement or as a result of value assignment in pattern 
matching. Value tracing does not occur when a value is assigned as the result of input or 
when values are assigned to formal arguments and local variables in the call of a 
programmer-defined function. 

If the name is not a natural variable, the tag is printed to identify the name being 
traced. For example, TRACE( .SUM<3>, 'VALUE' , 'SUM<3>' ) traces the third element of 
the array SUM. Here the tag SUM<3> (chosen to correspond to the created variable 
SUM<3>) provides a string that identifies the name of the trace request. As an example, 
consider the following program, which forms sums in several bins as given on data cards. 
The trace association must appear after creation of the array SUM, since the name 
. SUM<3> does not exist before the array is created. 


SANCHOR = 1 1 
6TRIM = 1 2 
STRACE = 1000 3 
CARDPAT = break!' ') . BIN SPAN!' ') REM . NUMBER 4 


* 

* 


THE FIRST CARD GIVES THE NUMBER OF BINS 
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5 


6 


7 

8 
9 


10 
1 1 
12 

13 

14 

For the input data 
10 

3 25 

1 27 

9 -75 
5 -1-65 

3 77 
7 -89 

2 75 

10 0 

3 -756 

7 499 
2 76 

4 23 

1 456 

5 87 

2 33 
10 23 

3 0025 

8 657 
3 +45 

the printed output is 

STATEMENT 9: SUM<3> = 25,TIME = 0 
STATEMENT 9: SUM<3> = 102,TIME =50 
STATEMENT 9: SUM<3>-654,TIME = 83 


* 

SUM = ARRAY(INPUT,0) :F(ERR) 

* 

* TRACE THE THIRD BIN. 

* 

TRACE(.SUM<3>,'VALUE’,’SUM<3>') 

* 

* SUBSEQENT CARDS CONTAIN A BIN NUMBER FOLLOWED BY A BLANK AND THEN 

* THE NUMBER TO BE ADDED TO THE BIN. 

* 

READ CARD = INPUT :F(DISPLAY) 

CARD CARDPAT :F(ERR) 

SUM<BIN> = SUM<BIN> + NUMBER ;S(READ)F(ERR) 

* 

♦ PRINT OUT THE SUMS 

* 

DISPLAY 

I = 1 

PRINT OUTPUT = ’SUM<' I ’> = ’ SUM<I> :F(END) 

I = 1+1 : (PRINT) 

END 
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STATEMENT 9: SUM<3> = -629,TIME = 166 
STATEMENT 9: SUM<3> = -584,TIME = 183 
SUM<1> = 483 
SUM<2> = 184 

SUM<3> -584 

SUM<4> =23 
SUM<5> = 152 
SUM<6> = 0 
SUM<7> =410 
SUM<8> = 657 

SUM<9> -75 

SUM<10> = 23 


8.1.2 Function Tracing 

There are three types of tracing for programmer-defined functions; CALL, RETURN, and 
FUNCTION. CALL and RETURN cause trace printout on the call to and return from a 
function. FUNCTION causes trace printout for both call and return. 

CALL tracing gives the level from which the call is made, the function name, and the 
value of its arguments. RETURN tracing gives the level to which the return is made. The 
following examples indicate the three types of tracing applied to a program that computes 
the number of combinations of N things taken M at a time. 


6TRACE = 1000 1 

TRACE('C’,'CALL') 2 

NM = break! ' , ’ ) . N ’ , ' break! ’ ' ) . M 3 

define!'c!n,m)') 4 

READ INPUT NM :F!END) 5 

OUTPUT = ’C!’ N ’,' M ')=' C!N,M) :!READ) 6 

* 

C M = LT!N - M,M) N - M 7 

C = EQ!M,0) 1 :S!RETURN) 8 

C = N ♦ C!N - 1,M - 1) / M :!RETURN) 9 

END 10 

For the input 15,6 the output is 

STATEMENT 6: LEVEL 0 CALL OP C!'15’,'6'),TIME = 16 


STATEMENT 9: LEVEL 1 CALL OF C!14,5),TIME = 50 

STATEMENT 9: LEVEL 2 CALL OF C!13,4),TIME = 50 

STATEMENT 9: LEVEL 3 CALL OF C!12,3),TIME = 66 

STATEMENT 9: LEVEL 4 CALL OF C!l1,2),TIME = 66 

STATEMENT 9: LEVEL 5 CALL OP C!10,1),TIME =83 

STATEMENT 9: LEVEL 6 CALL OP C!9,0),TIME = 83 
C!15,6)=5005 
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If TRACE ( ' C' , ' RETURN' ) is used instead, the output is 


STATEMENT 

8: 

LEVEL 

6 

RETURN 

OF 

C 

= 

1,TIME = 

34 

STATEMENT 

9: 

LEVEL 

5 

RETURN 

OF 

C 

= 

10,TIME = 

34 

STATEMENT 

9: 

LEVEL 

4 

RETURN 

OF 

C 

= 

55,TIME = 

50 

STATEMENT 

9: 

LEVEL 

3 

RETURN 

OF 

C 

= 

220,TIME 

= 50 

STATEMENT 

9: 

LEVEL 

2 

RETURN 

OF 

C 

= 

715,TIME 

= 67 

STATEMENT 

9: 

LEVEL 

1 

RETURN 

OF 

C 

= 

2002,TIME 

= 67 

STATEMENT 

9: 

LEVEL 

0 

RETURN 

OF 

C 

= 

5005,TIME 

= 67 


C(15,6)=5005 

and if TRACE ( 'C , 'FUNCTION' ) is used, the output is 


STATEMENT 6: 
STATEMENT 9; 
STATEMENT 9; 
STATEMENT 9; 
STATEMENT 9: 
STATEMENT 9: 
STATEMENT 9: 
STATEMENT 8; 
STATEMENT 9; 
STATEMENT 9; 
STATEMENT 9: 
STATEMENT 9: 
STATEMENT 9; 
STATEMENT 9; 
C(15,6)=5005 


LEVEL 

0 

CALL 

OF 

C 

LEVEL 

1 

CALL 

OF 

C 

LEVEL 

2 

CALL 

OF 

C 

LEVEL 

3 

CALL 

OF 

C 

LEVEL 

4 

CALL 

OF 

C 

LEVEL 

5 

CALL 

OF 

C 

LEVEL 

6 

CALL 

OF 

C 

LEVEL 

6 

RETURN 

OF 

LEVEL 

5 

RETURN 

OF 

LEVEL 

4 

RETURN 

OF 

LEVEL 

3 

RETURN 

OF 

LEVEL 

2 

RETURN 

OF 

LEVEL 

1 

RETURN 

OF 

LEVEL 

0 

RETURN 

OF 


'15','6'),TIME = 16 
14,5),TIME = 16 
13,4),TIME = 33 
12,3),TIME = 33 
11,2),TIME = 50 
10,1),TIME = 50 
9,0),TIME = 66 
C = 1,TIME = 66 
C = 10,TIME = 66 
C = 55,TIME = 83 
C = 220,TIME = 83 
C = 715,TIME = 83 
C = 2002,TIME = 99 
C = 5005,TIME = 99 


To facilitate the tracing of programmer-defined functions, the keyword 6FTRACE is 
provided. When 6FTRACE is a positive integer, all programmer-defined functions are 
traced on call and return. The value of SFTRACE is decremented by one each time a 
programmer-defined function is called or returns. When the value of £FTRACE reaches 
zero, function tracing stops. £ TRACE and £FTRACE are independent, and both may be 
used at the same time. The following program illustrates the use of £FTRACE. 

£FTRACE = 1000; £TRIM = 1 1 

* 

* THIS PROGRAM COMPUTES THE NUMBER OF SYMMETRIC BISECTIONS OF 

* A CHECKERBOARD OF EVEN ORDER. THE PROBLEM IS DESCRIBED IN 

* MARTIN GARDNER'S "MATHEMATICAL GAMES" IN SCIENTIFIC AMERICAN 


* NOVEMBER, 1962. 

* 

define!'AXIS(X,Y)') 3 

DEFINE('RIGHT(X,Y)') 4 

define!'LEFT(X,Y)') 5 

define!'up!x,y)') 6 

define!'down!x,y)') 7 

DEFINE!'COUNT!X)') 8 
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READ 

SUM = 0 



9 


N = INPUT 


;P(END) 

10 


BOARD = ARRAY (-N ' : ' N ' , ' 

-N ':' N) 


1 1 


BOARD<0,0> = ' : ' 



12 


AXIS(0,0) 



13 


OUTPUT = 'THERE ARE ' SUM 

' SYMMETRIC 

BISECTIONS OF A ' 

2 ♦ 14 

• 

A 

N ' BY ' 2 ♦ N ' CHECKERBOARD' 


:(READ) 

14 

♦ 

AXIS 

X = X + 1 



15 


EQ(X,N) COUNTO 


:S(RETURN) 

16 


IDENT(BOARD<-X,-Y>) 


:F(FRETURN) 

17 


IDENT(BOARD<X,Y>) 


:F(PRETURN) 

18 


BOARD<X,Y> = ':' 



19 


AXIS(X,Y) 



20 


UP(X,Y) 



21 

lit 

BOARD<X,Y> = 


:(RETURN) 

22 

RIGHT 

X = X + 1 



23 


EQ(X,N) COUNTO 


:S(RETURN) 

24 


IDENT( BOARD<-X,-Y>) 


;P(PRETURN) 

25 


IDENT(BOARD<X,Y>) 


:F(PRETURN) 

26 


BOARD<X,Y> = ' ; ' 



27 


RIGHT(X,Y) 



28 


UP(X,Y) 



29 


DOWN(X,Y) 



30 

A 

BOARD<X,Y> 


: (RETURN) 

31 

UP 

Y = Y + 1 



32 


EQ(Y,N) COUNTO 


:S(RETURN) 

33 


IDENT(BOARD<-X , -Y>) 


: F(PRETURN) 

34 


IDENT(BOARD<X , Y>) 


;F(PRETURN) 

35 


BOARD<X , Y> = ' : ' 



36 


RIGHT(X,Y) 



37 


UP(X , Y) 



38 


LEFT(X,Y) 



39 

* 

BOARD<X , Y> 


: (RETURN) 

40 

LEFT 

X = X - 1 



41 


EQ(X,-N) COUNTO 


: S(RETURN ) 

42 


IDENT ( BOARD<-X , -Y>) 


: P(PRETURN) 

43 


IDENT(BOARD<X , Y>) 


;F(PRETURN) 

44 


BOARD<X , Y> = ' ; ' 



45 


LEPT(X,Y) 



46 


UP(X,Y) 



47 


DOWN(X,Y) 



48 

A 

BOARD<X , Y> = 


: (RETURN) 

49 

DOWN 

Y = Y - 1 



50 
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EQ(Y,-N) count!) 

:S(RETURN) 

51 


IDENT(BOARD<-X,-Y>) 

:F(FRETURN) 

52 


IDENT(BOARD<X,Y>) 

:F(FRETURN) 

53 


BOARD<X,Y> = 


54 


RIGHT(X,Y) 


55 


LEFT(X,Y) 


56 


DOWN(X,Y) 


57 


BOARD<X,Y> = 

:(RETURN) 

58 

COUNT 

3|t 

SUM = SUM + 1 

:(RETURN) 

59 

END 



60 


For an input value of 2, this program produces the following output. 


STATEMENT 

13: 

LEVEL 

0 

CALL OF 

AXIS(0 , 

0).TIME = 

17 

STATEMENT 

20: 

LEVEL 

1 

CALL OF 

axis! 1 , 

0).TIME = 

34 

STATEMENT 

16: 

LEVEL 

2 

CALL OF 

COUNT! ' 

' ).TIME = 

34 

STATEMENT 

59: 

LEVEL 

2 

RETURN 

OF COUNT 

= ''.TIME 

= 34 

STATEMENT 

16: 

LEVEL 

1 

RETURN 

OF AXIS 

= ''.TIME 

= 50 

STATEMENT 

21 : 

LEVEL 

1 

CALL OF 

up! 1,0) 

.TIME = 50 


STATEMENT 

37: 

LEVEL 

2 

CALL OF 

right! 1 

.1).TIME = 

50 

STATEMENT 

24: 

LEVEL 

3 

CALL OF 

COUNT! ' 

' ).TIME = 

67 

STATEMENT 

59: 

LEVEL 

3 

RETURN 

OF COUNT 

= ''.TIME 

= 67 

STATEMENT 

24: 

LEVEL 

2 

RETURN 

OF RIGHT 

= ''.TIME 

= 67 

STATEMENT 

38: 

LEVEL 

2 

CALL OF 

up! 1 , 1 ) 

.TIME = 83 


STATEMENT 

33: 

LEVEL 

3 

CALL OF 

COUNT! ' 

' ).TIME = 

83 

STATEMENT 

59: 

LEVEL 

3 

RETURN 

OF COUNT 

= ''.TIME 

= 83 

STATEMENT 

33: 

LEVEL 

2 

RETURN 

OF UP = 

''.TIME = 

100 

STATEMENT 

39: 

LEVEL 

2 

CALL OF 

left! 1 , 

1 ).TIME = 

100 

STATEMENT 

46: 

LEVEL 

3 

CALL OF 

left!o, 

1 ).TIME = 

117 

STATEMENT 

46: 

LEVEL 

4 

CALL OF 

left!-1 

. 1).TIME = 

117 

STATEMENT 

42: 

LEVEL 

5 

CALL OF 

COUNT!' 

' ).TIME = 

117 

STATEMENT 

59: 

LEVEL 

5 

RETURN 

OF COUNT 

= ''.TIME 

= 133 

STATEMENT 

42: 

LEVEL 

4 

RETURN 

OF LEFT 

= ''.TIME 

= 133 

STATEMENT 

47: 

LEVEL 

4 

CALL OF 

UP!-1,1),TIME = 133 

STATEMENT 

33: 

LEVEL 

5 

CALL OF 

COUNT! ' 

').TIME = 

150 

STATEMENT 

59: 

LEVEL 

5 

RETURN 

OF COUNT 

= ''.TIME 

= 150 

STATEMENT 

33: 

LEVEL 

4 

RETURN 

OF UP = 

' '.TIME = 

150 

STATEMENT 

48: 

LEVEL 

4 

CALL OF 

DOWN!-1 

. 1).TIME = 

167 

STATEMENT 

52: 

LEVEL 

4 

FRETURN 

OF DOWN 

.TIME = 167 

STATEMENT 

49: 

LEVEL 

3 

RETURN 

OF LEFT 

= ''.TIME 

= 183 

STATEMENT 

47: 

LEVEL 

3 

CALL OF 

UP ! 0,1 ) 

.TIME = 183 

STATEMENT 

33: 

LEVEL 

4 

CALL OF 

COUNT! ' 

' ).TIME = 

183 

STATEMENT 

59: 

LEVEL 

4 

RETURN 

OF COUNT 

= ''.TIME 

= 200 

STATEMENT 

33: 

LEVEL 

3 

RETURN 

OF UP = 

''.TIME = 

200 

STATEMENT 

48: 

LEVEL 

3 

CALL OF 

DOWN! 0 , 

1 ).TIME = 

200 

STATEMENT 

52: 

LEVEL 

3 

FRETURN 

OF DOWN 

.TIME =217 

STATEMENT 

49: 

LEVEL 

2 

RETURN 

OF LEFT 

= ''.TIME 

= 217 
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STATEMENT 40; LEVEL 1 RETURN OP UP = '',TIME =217 
STATEMENT 22; LEVEL 0 RETURN OF AXIS = '',TIME = 217 
THERE ARE 6 SYMMETRIC BISECTIONS OF A 4 BY 4 CHECKERBOARD 


8.1.3 Label Tracing 

TRACE( name ,'LABEL') 

causes trace printout whenever transfer is made to name. No printout occurs if the 
statement labelled with the name is flowed into, or is entered as a function entry point. 

The following program, which converts numbers from hexadecimal form to decimal 
form, illustrates label tracing. 

1 
2 

3 

4 

5 

6 

7 

8 

9 

10 
1 1 
12 
13 


14 

15 

15 

16 
17 

The output for the indicated input is 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 16 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 16 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 16 

DEHEX(100) = 256 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 50 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 50 

DEHEX(OOOII) = 17 

STATEMENT 12; TRANSFER TO DEHEX1,TIME = 66 
DEHEXlOOOF) = 15 

STATEMENT 10; TRANSFER TO DEHEX1,TIME = 


STRIM = 1 

6TRACE = 100 

TRACE('DEHEX1','LABEL') 

PINDNO = BREAK(♦NO) . NO 

ONE = LEN(1) . NO 

LO = POS(O) SPAN('O') 


define!'dehex(str)no') 


(DEHEX.END) 


DEHEX STR LO 

DEHEX1 STR ONE = 

DEHEX = INTEGER(NO) 16 ♦ DEHEX + NO 
'ABCDEF' PINDNO 

DEHEX = 16 ♦ DEHEX + 10 + SIZE(NO) 

DEHEX.END 


F(RETURN) 
S(DEHEX1) 
F(FRETURN) 
(DEHEX1) 


READ NUMBER = INPUT ;F(END) 

+ OUTPUT = 'DEHEX(' NUMBER ') = ' DEHEX(NUMBER) 

+ ;S(READ) 

OUTPUT = 'UNABLE TO CONVERT ' NUMBER .-(READ) 


83 
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STATEMENT 12: TRANSFER TO DEHEX1,TIME = 99 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 99 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 116 

UNABLE TO CONVERT lABCG 

STATEMENT 10: TRANSFER TO DEHEX1,TIME = 133 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 133 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 149 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 149 

DEHEX(IABC) = 6844 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 199 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 199 

STATEMENT 12: TRANSFER TO DEHEX1,TIME = 199 

STATEMENT 12: TRANSFER TO DEHEX1,TIME =216 

DEHEX(OOOFACE) = 64206 


8.1.4 Keyword Tracing 

TRACE(name,'KEYWORD') 

causes trace printout when the value of the named keyword is changed. Only four 
keywords can be traced: ERRTYPE, FNCLEVEL, STCOUNT, and STFCOUNT. The 
following program, which converts numbers from decimal to hexadecimal form, illustrates 
keyword tracing. 

STRACE = 1000: £TRIM = 1 1 


TRACE{'STFCOUNT','KEYWORD') 
define!'HEXER{N)Q,R') 

HIGITS = '0123456789ABCDEF' 
LOCATN = LEN(*N) LEN(1) 

LOCATR = LEN{*R) LEN{1) 

READ NUM = INPUT 

OUTPUT = 'HEXER(' NUM ') = 
OUTPUT = 'UNABLE TO CONVERT 

* 

* 

HEXER integer!N) 

Q = GT!N,15) N / 16 
R = N - Q * 16 
N = Q 

HIGITS LOCATR 

HEXER = R HEXER 

HEX.END HIGITS LOCATN 

HEXER = R HEXER 


3 

4 

5 

. R 6 

. R 7 

:F!END) 8 

HEXER!NUM) :S!READ) 9 

' NUM :!READ) 10 

:F!FRETURN) 11 

:F!HEX.END) 1 2 

13 

14 

:F!FRETURN) 15 

:!HEXER) 16 

:F!FRETURN) 17 

:!RETURN) 1 8 

19 


END 
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The printed output for the indicated input is 


STATEMENT 

12: 

£STFCOUNT 

= 

1,TIME = 

17 

HEXER(15) = F 






STATEMENT 

12: 

£STFCOUNT 

= 

2, TIME = 

33 

HEXER(6844) = 

1ABC 




STATEMENT 

1 1 : 

£STFCOUNT 

= 

3, TIME = 

50 

STATEMENT 

9: 

£STFCOUNT = 

= 

4,TIME = ( 

67 

UNABLE TO CONVERT 

1239.0003 




STATEMENT 

12: 

£ STFCOUNT 

= 

5, TIME = 

83 

HEXER(236) = ] 

EC 





STATEMENT 

12: 

£STPCOUNT 

= 

6, TIME = 

100 

HEXER(64206) = 

= FACE 




STATEMENT 

12: 

£STFCOUNT 

= 

7, TIME = 

133 

HEXER(699050) 

= AAAAA 




STATEMENT 

12: 

£STFCOUNT 

= 

8,TIME = 

150 

HEXER(O) = 0 






STATEMENT 

12: 

£STFCOUNT 

= 

9, TIME = 

166 

HEXER(OOO) = I 

0 





STATEMENT 

12: 

£STPCOUNT 

= 

10,TIME = 

= 183 

HEXER(256) = 

100 





STATEMENT 

12: 

£STFCOUNT 

= 

11,TIME = 

= 266 

HEXER(123456789) 

= 75BCD15 




STATEMENT 

8: 

£STFCOUNT = 

= 

12,TIME = 

266 


Note that the keyword name, without the ampersand, is given in the call of TRACE. 


8.1.5 Discontinuation of Tracing 

Tracing is a global condition that depends upon the value of £ TRACE. Regardless of 
trace requests made through the TRACE function, there is no trace output if £ TRACE is 
not positive. The value of £ TRACE may be set to zero explicitly, or may reach zero as it 
is decremented as the result of tracing. Individual trace associations may be cancelled, 
however, by executing 

STOPTR (name,type) 

which cancels a single trace association for the name and type. Thus the tracing of 
statement failure is stopped by executing 

STOPTR(’STPCOUNT','KEYWORD’) 

If an attempt is made to cancel a trace association that has not been made, the call of 
STOPTR fails. 
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8.2 Programmer-Defined Trace Functions 

The TRACE function has an optional fourth argument that permits the programmer to 
supply procedures for tracing. The complete form of the function is 

TRACE(name,type,tag,function) 
where function is a programmer-defined function. 

When the traced action occurs, the function is called with name as its first argument 
and tag as its second argument. The keywords £ TRACE and £FTRACE are automatically 
set to zero on entry to a programmer-defined trace procedure and restored on return. This 
prevents accidental tracing of a trace procedure. The programmer may change the values 
£ TRACE and £FTRACE while in a trace procedure. Upon return from the function, the 
former values are restored. 


8.2.1 Invoking Programmer-Defined Trace Procedures 

The time at which a programmer-defined trace procedure is called depends on the type 
of trace. 

(1) VALUE: just after assignment of the new value. 

(2) CALL: just after evaluation of the arguments, but before execution of the first 
statement in the function. 

(3) RETURN: just before the return is made. 

(4) FUNCTION: same as for CALL and RETURN. 

(5) LABEL: just before transfer to the label. 

(6) KEYWORD: just after the keyword is changed. 


8.2.2 Tools for Writing Programmer-Defined Trace Procedures 

Special information is required for writing more elaborate programmer-defined trace 
procedures. Three keywords and three functions are provided expressly for this purpose. 

1. £STNO is a protected keyword whose value is the statement number of the statement 
currently being executed. 

2. £LASTNO is a protected keyword whose value is the statement number of the last 
statement executed. 
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3. £RTNTYPE is a protected keyword whose value is the type of return (RETURN, 
FRETURN , or NRETURN ) made by the last defined function to return. 

4. ARG( f unction,n) is a function whose value is the name of the nth argument of the 
programmer-defined function. ARG is useful in writing programmer-defined trace 
procedures that trace several functions and need to determine the names of the formal 
arguments of the functions being traced. 

5. LOCAL(function,n) is a function whose value is the name of the nth local variable 
of the defined function. 

6. FIELD (data type,n) is a function whose value is the name of the nth field of the 
programmer-defined data type. 

The following example illustrates a programmer-defined function, VALTR, that prints a 
trace output only when a traced variable is assigned a specified value. KEY is a global 
variable. Trace output only occurs when a traced variable is assigned the value of KEY . If 
the variable being traced is not a string, the tag is used in the printed output. Use of this 
function is illustrated in the following program, which produces trace output when certain 
variables are assigned the value 2 5 . 


POWER = ARRAY('25,5') 1 

KEY =25 2 

DEFINE('VALTR(VAR,TAG)ST,TIME') 3 

6TRACE = 1000 4 

TRACE Cl',' VALUE ' , , ' VALTR' ) 5 

TRACE(.POWER<5,2>,'VALUE','5 ♦♦ 2','VALTR') 6 

TRACE(.POWER<25,1>,'VALUE','25 ♦♦ 1','VALTR') 7 

* 

* SET UP MATRIX OF INTEGER POWERS 

* 

J = 1 8 

NEXTI 1 = 0 9 

NEXTP 1=1+1 10 

POWER<I,J> = I ♦♦ J ;S(NEXTP) 11 

J = LT(J,5) J + 1 ;S(NEXTI)F(END) 12 

* 

VALTR ST = 6LASTNO 13 

TIME = TIMEO 14 

IDENT($VAR,KEY) ;F(RETURN) 15 

TAG = IDENT(DATATYPE(VAR),'STRING') VAR 16 

OUTPUT = 'KEY VALUE "' KEY '" ASSIGNED TO ' TAG 17 

+ ' IN STATEMENT ' ST ' AT TIME ' TIME 17 

+ ;(RETURN) 17 

END 18 
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The printed output is 


KEY 

VALUE 

"25" 

ASSIGNED 

TO 

I 

IN 

STATEMENT 

10 

AT 

TIME 

67 


KEY 

VALUE 

"25" 

ASSIGNED 

TO 

25 

** 

1 IN STATEMENT 

1 1 AT 

TIME 

83 

KEY 

VALUE 

"25" 

ASSIGNED 

TO 

5 

** 

2 IN STATEMENT 

11 AT 

TIME 

117 

KEY 

VALUE 

"25" 

ASSIGNED 

TO 

I 

IN 

STATEMENT 

10 

AT 

TIME 

167 


KEY 

VALUE 

"25" 

ASSIGNED 

TO 

I 

IN 

STATEMENT 

10 

AT 

TIME 

250 


KEY 

VALUE 

"25" 

ASSIGNED 

TO 

I 

IN 

STATEMENT 

10 

AT 

TIME 

333 


KEY 

VALUE 

in 

ASSIGNED 

TO 

I 

IN 

STATEMENT 

10 

AT 

TIME 

416 



8.3 Other Tracing Techniques 

Only programmer-defined functions can be traced. Occasions arise, however, when it is 
desirable to trace primitive functions, or even operators. The effect of such a trace can be 
achieved by using programmer-defined functions in conjunction with OPSYN. Suppose, for 
example, a trace of the function SIZE is desired. Execution of the statements 

OPSYNf'SIZE.','SIZE') 
define!'SIZE(S) ' ) 

TRACE('SIZE','FUNCTION') 

in conjunction with the procedure 

SIZE SIZE = SIZE.(S) :(RETURN) 

provides a suitable redefinition of SIZE that performs the same operation as the primitive 
function SIZE, and which can be traced as indicated. 

Operators can be traced in a similar manner. Two applications of OPSYN are required 
as indicated in the following example that traces addition of data given on input cards. 

define!'SUM!X,Y)') 

OPSYN!,2) 

OPSYN!'+','SUM',2) 

STRIM = 1; 6FTRACE = 100 

* 

ADD TOTAL = TOTAL + INPUT ;S!ADD) 

OUTPUT = 'THE TOTAL IS ' TOTAL :!END) 

* 

SUM SUM = X # Y : (RETURN) 

* 

END 

The printout, for the data indicated, is 

STATEMENT 6; LEVEL 0 CALL OF SUM!'','123'),TIME = 17 
STATEMENT 8: LEVEL 0 RETURN OF SUM = 123,TIME = 17 
STATEMENT 6: LEVEL 0 CALL OF SUM!123,'4.92'),TIME = 33 
STATEMENT 8: LEVEL 0 RETURN OF SUM = 127.9199,TIME =33 
STATEMENT 6: LEVEL 0 CALL OF SUM!127.9199,'87'),TIME = 50 


1 

2 

3 

4 

6 

7 

8 
9 
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STATEMENT 

8: 

LEVEL 

0 

RETURN 

OF SUM =214. 

9199 

,TIME = 

50 


STATEMENT 

6: 

LEVEL 

0 

CALL OF 

SUM(214.9199 

r '0. 

05'),TIME = 

66 

STATEMENT 

8: 

LEVEL 

0 

RETURN 

OF SUM = 214. 

9699 

,TIME = 

66 


STATEMENT 

6: 

LEVEL 

0 

CALL OF 

SUM(214.9699 

' 3 

9 J • 

17'),TIME = 

83 

STATEMENT 

8: 

LEVEL 

0 

RETURN 

OF SUM = 218. 

1399 

,TIME = 

83 


STATEMENT 

6: 

LEVEL 

0 

CALL OF 

SUM(218.1399 

, '20 

'),TIME 

= 100 

STATEMENT 

8: 

LEVEL 

0 

RETURN 

OF SUM = 238. 

1399 

,TIME = 

100 



8.4 Dumping Natural Variables 

A listing of natural variables and their values can be obtained by executing the function 
DUMP . A call of DUMP ( N ) provides this dump if N is a nonzero integer. The list of 
variables is not alphabetized, and variables that have null string values are omitted. If N 
is zero, no dump is given. Thus diagnostic dump statements can be placed in a program 
and the dumping turned on and off globally by assigning nonzero and zero values to the 
argument of DUMP. 


Exercises 

Exercise 8.1: Write a programmer-defined trace procedure that produces trace printout of 
the same form that is produced by standard value tracing. 

Exercise 8.2: Write a programmer-defined trace procedure and use it to trace 50 variables, 
COLn (n=1 , ,50), so that when COLn is assigned a value, a copy of the value is 

printed, indented n columns. 

Exercise 8.3: Write a programmer-defined trace procedure that traces OUTPUT so that 
when EOF is assigned to OUTPUT, £ OUTPUT is set to zero to stop further output. 



CHAPTER 9 


Input and Output 


Input and output are accomplished by using variables associated with data sets (files). 
In the case of a variable associated in the output sense, each time the variable is assigned 
a value, a copy of the value is put out onto the associated data set. In the case of a 
variable associated in the input sense, each time the value of the variable is used, a new 
value is read from the associated data set and becomes the new value of the variable. 
Thus, input and output go on during program execution without any explicit I/O 
statements, as a result of I/O associations. 

Details of the material contained in this chapter vary from machine to machine. 
Material appearing here refers specifically to the IBM System/360 operating under OS. 


9.1 Printed Output 

The variable OUTPUT is associated with the system print data set. Consequently, 
whenever OUTPUT is assigned a value, printout is generated. For example, 

OUTPUT = 'THE SELECTED VALUES ARE' 

produces the output 

THE SELECTED VALUES ARE 

Output may also result from value assignment specified in patterns. For example, 

PEXP = BAL . EXP1 . OUTPUT '+' BAL . EXP2 . OUTPUT 


EXP PEXP 

prints the two terms in EXP, and assigns their values to EXP1 and EXP2. This type of 
output is often useful for diagnostic purposes, and does not affect the pattern matching or 
the assignments made to EXP1 and EXP 2 . 

Ordinary printout is printed 132 characters per line, with as many lines as necessary 
being generated. Carriage control is provided to give single spacing. The null string is 
treated as a blank character and a blank line is printed for it. Strings are usually assigned 
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to output variables. If an object other than a string is assigned to an output variable, the 
string representation of the object (as described in Chapter 7) is printed. 


9.2 Punched Output 

The variable PUNCH is associated with the standard punch data set. Consequently, 
whenever PUNCH is assigned a value, a punched card is generated. For example, 

PUNCH = 0 

produces a card with zero punched in column one. 

All the remarks about print output apply to punch output, except that 80 characters are 
punched per card, with additional cards punched as necessary for longer strings. The 
cards have no sequence numbering or identification unless provided in the strings that are 
punched. 


9.3 Input 

The variable INPUT is associated with the standard input data set. Whenever the value 
of INPUT is used, a card image is read from the input stream and becomes the new value 
of INPUT . For example, 

OUTPUT = INPUT 

reads a card image and prints it. Similarly, 

INPUT BAL . EXP 

reads a card image and attempts to match a balanced string. If 6TRIM is zero, all 80 
columns of the card images are read, and the value of INPUT is an eighty character 
string. If 6TRIM is nonzero, trailing blanks are automatically deleted. 

Since each use of INPUT reads a card image, previous values of INPUT are lost unless 
they are assigned to other variables. 

If the end of the input data set is encountered when a value of INPUT is requested, 
failure results. This failure can be used to detect the end of an input data set. For 
example, the following statements 

I = 1 

READ DATA<I> = INPUT 
I = 1+1 

OUT 

read card images into the array DATA until the input data stream is exhausted (or I 
exceeds the range of DATA). Control is then transferred to OUT. 


; F(OUT) 
: (READ) 
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9.4 The I/O System 

All input/output is handled by FORTRAN IV I/O routines. That is, SNOBOL4 I/O 
is done by the same system that does I/O for FORTRAN IV object programs. 
Consequently, the conventions and I/O concepts specified for the FORTRAN IV language 
also apply to SNOBOL4. In addition, the version of the language described here operates 
under OS/360. It is necessary to understand both the fundamentals of FORTRAN IV 
I/O [7,8] and job control language ( JCL) [9] in order to use the I/O facilities of 
SNOBOL4 effectively. 

In FORTRAN, data sets (files) have numbers (data set reference numbers). These 
numbers are referred to in source-language programs and are associated at run time with 
specific data sets by JCL statements. There are three standard data sets: 

normal input stream ( 5) 
normal print output ( 6) 
normal punch output ( 7) 

DDNAMEs in JCL are used to associate the data set reference numbers with actual 
data sets. DDNAMEs for FORTRAN have the form FTxxFyyy, where xx corresponds to 
the data set reference number and yyy is a file sequence number for multifile data sets. 
The typical DD cards used in SNOBOL4 associate the standard data set reference 
numbers 5,6, and 7 as follows: 

//FT06F001 DD SYSOUT=A 
//FT07F001 DD UN1T=PUNCH 
//FT05F001 DD ♦ 

This JCL, or its equivalent, is contained in the SNOBOL4 cataloged procedure, and is 
supplied automatically when the cataloged procedure is used. 

A wide range of devices and record structures can be specified on DD cards. By 
changing the DD cards, the data streams can be assigned to different data sets at run 
time. Thus, 

//FT05F001 DD DSN=PROG1,V0L=SER=BTLXX1,UNIT=DISK,DISP=OLD 

specifies an input stream from a data set PROG 1 on a disk file. Similarly, 

//FT07F001 DD DSN=PUNCHER,V0L=SER=MYSAV1,UNIT=9TRACK,DISP=(.PASS), 

// DCB=(RECFM=FB,LRECL=80,BLKSIZE=800),LABEL(1,SL) 

causes punched output to go onto a 9-track tape with a blocking factor of ten. 

A complete discussion of DD statements is beyond the scope of this book, and is an 
involved and difficult subject. The important fact is that JCL permits the specification of 
a wide variety of devices and record structures. This specification is made when the 
program is run and requires no alteration of the program. 
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FORTRAN supports multifile data sets. The last three characters in the DDNAME 
specify the file number. When FORTRAN comes to the end of a file, it automatically 
opens the next file of the same data set reference number. 

Thus, for example, input may come from several files: 

//FT05F002 DD DSN=DATA2,UNIT=DISK,V0L=SER=BTLH04,DISP=OLD 
//FT05F001 DD ♦ 


/♦ 

With these DD cards, when the in-line data stream is exhausted, failure occurs. A 
subsequent attempt to input reads data from DATA2 . In a multifile data set, failure occurs 
each time the end of a file is reached. 


9.5 Output Associations 

The variables OUTPUT and PUNCH have predefined output associations. Programmer- 
defined associations may be made using the function OUTPUT . The form of the function is 


OUTPUT(name,number,format) 

OUTPUT associates the name with the data set reference number according to the given 
format. The format is a string specifying a FORTRAN IV format. The following 
statements correspond to the associations for the variables OUTPUT and PUNCH: 

OUTPUT('OUTPUT',6,'(IX,132A1)') 
output!'punch',7,'{80A1)') 

Using the OUTPUT function, any variable can be associated with any data set reference 
number. For example, 

PRFORM = '(1X,132A1)' 

TEST = array!'8,8') 

OUTPUT!.TEST<1,1>,6,PRFORM) 

OUTPUT!.TEST<8,8>,6,PRFORM) 

associate the array elements TEST<1,1> and TEST<8,8> with the ordinary print data 
set and with the standard print format. As a result, whenever either TEST<1 , 1> or 
TEST<8,8> is assigned a value, the new value is printed. 

Data set reference numbers are not restricted to 5, 6, and 7, but can range from 1 
through 9 9. Associations can be made with data set reference numbers other than the 
standard ones. In this case, a DD statement for that number must be provided when the 
program is run. For example 
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OUTPUT('TEXT' ,7, '(80A1)') 
associates TEXT with the punch data set. On the other hand, 

OUTPUT('TEXT',20,'(80A1)') 
and the DD statement 

//FT20F001 DD DSN=NEWF,UNIT=TAPE,VOL=SER=MYSAV1,LABEL=(2,SL), 

// DISP=(,PASS),DCB=(RECFM=FB,BLKSIZE=800,LRECL=80) 

allow the program to put card images onto the second file of a tape. The LRECL 
parameter of 80 and the format (80A1) relate the record size of the file to the record 
size in the format. 

Formats used in output association must specify the conversion of at least one element 
by A-conversion. (Normally nA1 -conversion is used.) Integers and reals are converted into 
strings by SNOBOL4. F- and I-conversion must not be used. In addition to 
A-conversion, quoted literals, X-, H-, T-, and Z-conversion may be specified [7,8]. Carriage 
control must be provided for printing; otherwise the first character of the string is 
consumed for this purpose. Consider 

output!'title',6,'(1H1,132A1/(IX,132A1))') 

When a value is assigned to TITLE, a page is ejected and the value titles the next page 
of output. The use of literals is illustrated by 

output!'sum' ,6,"! ' SUM=',128A1/!IX,132A1))") 

which includes identifying information with the format. Subsequently, 

SUM = 300 

causes the printout 

SUM=300 

On the IBM 360, the largest number accepted in format statements is 25 5. Larger 
numbers are reduced modulo 256. Thus 

output!'out',10,'!400A1)') 

is equivalent to 

output!'out',10,'!144A1)') 

The effect of larger numbers can be obtained by using repetition. For example, 
output!'out', 10,'!20!20A1))') 
specifies 400 -character output. 

The predefined associations can be changed. Thus, 
output!'output', 6,'!IX,120A1)') 
shortens the line length for OUTPUT to 120 characters. 



9.7.1 


BACKSPACE 


169 


The second and third arguments of OUTPUT can be defaulted. If the second argument is 
zero or null, the standard output unit, 6, is assumed. If the third argument is null, the 
standard print format, ( IX, 132A1) , is assumed. Thus 

OUTPUT(’OUT’) 

provides the same association as is predefined for OUTPUT. 


9.6 Input Associations 

Programmer-defined input associations can be made using the function INPUT. The 
form of this function is 

INPUT( name,number,length) 

INPUT associates the name with the data set reference number, and specifies that the 
resulting string is to have the given length. (Notice in particular that no format is 
specified.) INPUT has a predefined association equivalent to 

input!'input',5,80) 

If the length is less than the record size on the data set being read, the last part of the 
record is lost. Hence, 

input!'input',5,72) 

changes the association for INPUT so that only 72 columns are read. Columns 73 through 
80 are lost if data set reference number 5 is associated with ordinary card input. A length 
longer than the record size should not be specified. 

The second and third arguments of INPUT can be defaulted. If the second argument is 
zero or null, the standard input unit, 5, is assumed. If the third argument is zero or null, 
the standard length of 80 is assumed. Thus 

input!'in') 

provides the same association for IN as is predefined for INPUT. 


9.7 Other I/O Functions 

Several other functions are provided for I/O-related operations [7,8]. All of these 
functions return the null string as value. 
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9.7.1 BACKSPACE 

BACKSPACE ( number ) backspaces one record on the data set associated with the number. 
For example, 

backspace!15) 

backspaces a record on the data set associated with data set reference number 15. 


9.7.2 DETACH 

DETACH (name) removes any input and output association that the name may have. For 
example, 

DETACH('OUTPUT') 
terminates normal print output. 


9.7.3 ENDFILE 

ENDFILE ( number ) writes an end of file on (closes) the data set specified by the number. 
For example, 

ENDFILE(20) 

closes the data set associated with data set reference number 2 0. 


9.7.4 REWIND 

REWIND ( number ) repositions the data set associated with the number to the first file. For 
example, 

REWIND(10) 

rewinds the data set associated with data set reference number 10. Subsequently, reference 
to 10 refers to the beginning of the data set specified by FT10F001 (even if 10 is a 
multifile data set). 
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9.8 Turning Off Input and Output 

Input and output occur automatically as a result of associations with variables, provided 
6 INPUT and 6OUTPUT are nonzero. Both these keywords have initial default values of 
one. 

In some circumstances, it may be desirable to turn off input or output, or both. This 
may be achieved by setting the corresponding keyword to zero. Ifformal automatic input 
and output can be restored by assigning nonzero values to the keywords. 


CHAPTER 10 


Running a SNOBOL4 Program 


Most of this book is devoted to language features and how they are used in 
programming. This chapter is concerned with the running of a SNOBOL4 program. 

A SNOBOL4 run consists of three parts; 

(1) compilation, 

(2) execution, and 

(3) termination. 


10.1 Compilation 

During compilation, the SNOBOL4 system is initialized and the source program is 
converted into intermediate object code, which is in a form suitable for interpretation 
during execution. Compilation uses the same procedures that are used by the CODE 
function. Additional procedures are used to read the source program from the input data 
set, print a compilation listing on the output data set, and note errors detected in the 
source program. 


10.1.1 Source-Program Input 

Input to the compiler comes from the standard input stream associated with data set 
reference number 5 . On the IBM 360, the compiler begins reading the program from the 
data set associated with FT05F001 . Only the first 72 characters of a line are processed by 
the compiler, and columns 73 through 80 can be used for sequential numbering or other 
identification. 

The compiler continues to read until it encounters the end statement of the program. If 
an end of file is encountered before the end statement is found, the compiler goes to the 
next file for reference number 5. The input program may therefore be in several sections 
given byFT05F001,FT05F002, etc. When the end statement is encountered, compilation 
stops. If the end statement is missing from the program, the compiler tries to find the next 
file, assuming more program remains to be compiled. When this file is not found, an error 
message is printed by the FORTRAN I/O system and the run is terminated. 
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10.1.2 Source Listing 

A listing of the source program with identifying statement numbers is printed on the 
standard output data set. The format of this listing is illustrated by examples later in this 
chapter. These examples have been slightly modified to fit within the page width of this 
book, but are otherwise accurate. 


10.1.3 Listing Control 

Listing of the source program and placement of statement numbers is controlled by 
control lines appearing in the source program. A minus sign at the beginning of a line 
identifies that line as a control line. There are three controls: LIST, UNLIST, and EJECT. 


LIST turns on listing of the source program and controls placement of the statement 
numbers. The control line 

-LIST LEFT 

causes placement of statement numbers at the left side of the listing. The control line 
-LIST RIGHT 

causes placement of statement numbers at the right. In the initial, default mode, the 
source program is listed with numbers at the right. The control line 

-LIST 

is equivalent to 
-LIST RIGHT 

The control line 
-UNLIST 

turns off listing of the source program. Listing may be restored by using LIST. 

A page eject in the source listing may be obtained by the control line 
-EJECT 

Blanks may appear between the minus sign and the control word. One or more blanks 
must separate LIST from LEFT or RIGHT. Any characters other than LEFT following 
blanks on the LIST control line cause the same action as RIGHT. An erroneous control 
line is ignored. 
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10.1.4 Operator Precedence and Associativity 

SNOBOL4 permits considerable flexibility during program execution. Functions can be 
defined then, data objects created, and operators redefined. The precedence and 
associativity of operators cannot be changed, however. Precedence and associativity are 
properties of the operator symbol and are used during compilation to determine the 
structure of expressions. For example, the statement 

P = AB$CD|Q 
is equivalent to 

P = ((A (B $ O) D) I Q 

Of course, parentheses can always be used to override the built-in precedence and 
associativity. The statement 

P = (A B) $ C (D 1 Q) 

is quite different than the previous statement. 

For reference purposes, the precedence and associativity of the binary operators is listed 
in the following table. 


Binary Operators 


Graphic 

Definition 

Associativity 

Precedence 


none 

right 

12 

? 

none 

left 

12 

$ 

immediate value assignment 

left 

11 

• 

conditional value assignment 

left 

11 

! 

exponentiation 

right 

10 

% 

none 

left 

9 

* 

multiplication 

left 

8 

/ 

division 

left 

7 

« 

none 

left 

6 

+ 

addition 

left 

5 


subtraction 

left 

5 

S 

none 

left 

4 

blank 

concatenation 

left 

3 

1 

alternation 

left 

2 

£ 

none 

left 

1 
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10.1.5 Errors Detected during Compilation 

Certain kinds of errors in the source program are detected during compilation. When an 
error is detected in a statement, compilation of that statement is terminated, and an error 
message is printed below the statement. A marker indicating the vicinity of the error is 
printed also. This marker indicates how much of the statement the compiler had accepted 
before detection of the error. Depending on the nature of the error, this marker may be 
before or after the point of the error. A typical error is failure to place blanks around a 
binary operator. An example, with the resulting error message, is 

X = A+B 

I 

*** ILLEGAL CHARACTER IN ELEMENT 

The compiler detects this error as an erroneous character ( +) in an element, and refuses 
to accept the element as indicated by the marker. 

Since compilation of a statement stops when an error is encountered, only the first error 
in any one statement is detected. Compilation continues with the next statement. However 
if more than fifty erroneous statements are encountered, compilation of the program ceases 
and the program is not executed. 


10.1.6 Compilation Error Messages 
A list of compilation error messages follows. 

1. BINARY OPERATOR MISSING OR IN ERROR. This error indicates an erroneous binary 
operator or a missing blank between expressions. Some examples are 

X = F(X)*** 2 

TEXT = ' ( ' TEXT ' ) ' 

M = (A B)N 

2. ERRONEOUS INTEGER. This error occurs if an integer literal in a statement exceeds the 
magnitude of the largest possible integer. On the IBM 360, the largest integer is 2 ^ -1 . 

3. ERRONEOUS LABEL. This error occurs if the first character of a statement is not a 
blank, integer, letter, ♦ , or -. 

4. ERRONEOUS OR MISSING BREAK CHARACTER. This error occurs if a break character 
appears in an erroneous context, or if a nested expression is not closed. Some examples are 


X = (A,B) 

A<1,2 ) = 5 

F(G(X) :S(LOOP) 

5. ERRONEOUS REAL NUMBER. This error occurs if a real literal that is too large or too 
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small appears in a statement. On the IBM 360, reals have the approximate range 10"^® 
to 10 ' ® . 

6. ERRONEOUS SUBJECT. This error occurs if an erroneous construction appears in the 
subject field. An example is 

, = 2 

7. ERROR IN GOTO. This error occurs when a syntactic error is found in the goto field. 
Some typical errors are 

:S(L1)S(L2) 

;S(A1) ;P(A2) 

:<CODE) 

8. ILLEGAL CHARACTER IN ELEMENT. This error typically occurs when blanks are not 
provided where required. Some examples are 

X = 3: 

E = 3.5P 

9. IMPROPERLY TERMINATED STATEMENT. This error occurs when a statement 
terminates before a construction is complete. An example is 

N = M + 

10. PREVIOUSLY DEFINED LABEL. This error occurs when a duplicate label is 
encountered. The first occurrence of a label holds, and subsequent occurrences are 
erroneous. 

11. UNCLOSED LITERAL. This error occurs when a closing quotation mark is omitted. 
Examples are 

LETTER = 'A 

TEXT = 'HE YELLED STOP" 


10.2 Execution 

Execution of the compiled program begins when compilation is complete. Ordinarily, 
program execution begins with the first statement of the program. Program execution can 
be started with any labelled statement by specifying that label in the end statement. The 
label of the first statement to be executed is placed in the position of the subject. For 
example, 

END INIT 

causes program execution to begin with the statement labelled INIT. The end statement 
END END 

has the effect of a compile-only run. 
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10.2.1 The Sequence of Evaluation 

An understanding of the sequence of evaluation requires an understanding of the overall 
evaluation of a statement in terms of its major components. There are three major types of 
statements: assignment, pattern matching, and replacement. These have the forms: 

label subject = object goto 

label subject pattern goto 

label subject pattern = object goto 

Labels and gotos are optional. The object may be explicitly omitted, in which case the 
object is taken to be an expression that has the null string as value. 

There are two degenerate statement forms as well: 

label subject goto 

label goto 

Labels and gotos are optional in these forms as well. Thus a blank line is an acceptable 
statement. The replacement statement is the most complicated and general form and is 
used for illustration. All other statement forms can be considered formally as degenerate 
replacement statements, and the evaluation of the degenerate forms can be understood 
from the evaluation of the replacement statement by skipping the missing components. The 
sequence of evaluation is: 

1. The label requires no evaluation, and in fact is not part of the statement at all. It 
merely serves to identify the statement. 

2. The subject is evaluated first. If the evaluation of the subject fails, the statement fails, 
the goto is processed, and evaluation of all other components is skipped. If no failure goto 
is specified, control passes to the next statement. 

3. The pattern is evaluated next. If this evaluation fails, the statement fails and the goto is 
processed as in the case of subject failure. 

4. The pattern match is performed next. If the pattern match fails, the statement fails, 
conditional value assignment is not performed, the replacement is skipped, and the goto is 
processed. Immediate value assignment, and other effects which occur dynamically during 
pattern matching, may take place before the pattern match fails. If pattern matching 
succeeds, conditional value assignment is performed for those components that matched. 

5. The object is evaluated. If this evaluation fails, the statement fails, no replacement is 
performed, and the goto is processed. 

6. The replacement is performed. 

7. The goto is processed. Goto processing depends on the structure of the goto and 
whether or not the statement fails. If the statement succeeds, only an unconditional or 
success goto in the statement is evaluated. If the statement fails, only an unconditional 
goto or failure goto in the statement is evaluated. Transfer is made to the evaluated goto. 
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if there is one, or control is passed to the next statement. Failure in evaluation of a goto 
is an error. 

Any of the components of a statement may be arbitrarily complicated and may invoke 
many kinds of processes. Calls to programmer-defined functions can occur, for example, in 
any component of a statement (except the label), and even take place in the middle of 
pattern matching as the result of the evaluation of unevaluated expressions. 

Within an expression, the order of evaluation depends on the order of the components 
and the operations performed on them. Evaluation of the components of an expression is 
from left to right. In complicated expressions, components are nested, and the order of 
evaluation may be determined by examining the fully parenthesized form of the expression 
as determined from the rules of precedence and association. Consider the expression 

(K L F(A + B ♦ C)) 
which has the fully parenthesized form 



P((A + (B ♦ C)))) 


4 5 6 



/ 


10 


The order of evaluation of this expression is as indicated. If F is a programmer-defined 
function, its evaluation involves the execution of other statements and may in itself be 
very complicated. 

In order to understand how failure is handled, it is important to know what operations 
can fail. 


1. Obtaining the value of a variable fails if the variable has an input association 
and an end-of-file condition is encountered. An attempt to read occurs only if the 
value of the variable is required, not merely because the variable appears in a 
statement. Thus, neither 

INPUT = 'O' 


nor 


LT(N,M) 


; S(INPUT) 
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requires the value of INPUT and hence no attempt is made to read. 

2. Primitive predicates fail if the stated condition is not met. The unary negation 

operator, for example, fails only if its operand does not fail. 

3. The primitive functions ARG, CODE, COLLECT, CONVERT, DUPL, EVAL, FIELD, 

LOCAL, REPLACE , and STOPTR fail in certain circumstances. 

4. Array references fail if an index is out of bounds. 

5. Pattern matching may fail for a variety of reasons. 

6. Programmer-defined functions fail by transferring to FRETURN. 

Failure is a condition that causes a process to terminate and return to the process that 
called it, which in turn terminates and passes the failure condition back, until eventually 
the statement itself fails. The exception is the unary negation operator that converts a 
failure condition into successful evaluation, and converts successful evaluation into a 
failure condition. A statement fails if failure occurs in the evaluation of its subject or 
pattern, if pattern matching fails, or if failure occurs in evaluation of its object. 

Details of function evaluation deserve special note. All the arguments to a programmer- 
defined function are evaluated before the function is called. If too many arguments are 
provided to the call of a programmer-defined function, the extra arguments are evaluated, 
but not passed. If the evaluation of any argument fails, a failure condition is returned and 
the function is not entered. 

Primitive functions are called before their arguments are evaluated, and each function 
evaluates its own arguments. Too many arguments in the call of a primitive function is an 
error. If too few arguments are provided in the call of a primitive function, null strings 
are provided for the omitted arguments. An exception to this rule concerns functions 
invoked by APPLY or called through an OPSYNed synonym. Such calls must contain the 
correct number of arguments. 


10.2.2 Error Conditions 

As indicated throughout this book, there are a number of errors that may occur during 
program execution. Some errors are relatively insignificant, while others are so severe that 
continued program execution is impossible. Error conditions are therefore divided into two 
categories: 
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(1) conditionally fatal, and 

(2) unconditionally fatal. 

The handling of conditionally fatal errors is controlled by two keywords. SERRLIMIT 
limits the number of conditionally fatal errors that are allowed. If a conditionally fatal 
error occurs and the value of &ERRLIMIT is greater than zero, the value of ERRLIMIT is 
decremented by one, and failure occurs in the expression with the error. If the value of 
6 ERRLIMIT is zero, program execution terminates with an error message. For example, 

6ERRLIMIT = 10 

permits ten conditionally fatal errors. The eleventh error causes error termination. The 
initial, default value of SERRLIMIT is zero, so that any error causes error termination 
unless the value of SERRLIMIT is changed. 

Each error has a number. When the error occurs, this number becomes the value of the 
protected keyword SERRTYPE, serving to identify the error. ERRTYPE can be traced, 
permitting diagnosis of conditionally fatal errors under program control. 


10.2.3 Program Error Messages 

A list of errors follows. Errors 1 though 16 are conditionally fatal. Errors 17 through 28 
are unconditionally fatal and cause error termination regardless of the value of 
SERRLIMIT. 

1. ILLEGAL DATA TYPE. This error is one of the most frequent to occur. A typical 
example is 

X = X + 'A' 

since A is not a numeral string. 

2. ERROR IN ARITHMETIC OPERATION. This error occurs if an attempt is made to 
divide by zero, raise zero to a nonpositive power, or if the result of an arithmetic 
operation exceeds the magnitude for integers or reals. On the IBM 360, integers have the 
range of -2 ’ ’ to 2 ^ -1 and reals have the approximate range 10'^* to 10 ^ ® . 

3. ERRONEOUS ARRAY OR TABLE REFERENCE. This error occurs if the value of the 
variable given in a reference is not an array or a table, or if the first argument of ITEM is 
not an array or table. An example is 

ITEM('A',3) 

4. NULL STRING IN ILLEGAL CONTEXT. This error occurs in a variety of situations. A 
typical one is 

A = $X 

where the value of X is the null string. 
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5. UNDEFINED FUNCTION OR OPERATION. This error occurs if an attempt is made to 
call a function or use an operator that has no definition. For example, 

A = X # y 

produces this error if # is not given a definition by using OPSYN. 

6. ERRONEOUS PROTOTYPE. This error occurs when ARRAY, DATA, DEFINE, or LOAD is 
called with a prototype that is syntactically erroneous. An example is 

define!'F(X') 

7. UNKNOWN KEYWORD. This error occurs if the unary operator S is applied to a string 
that is not a keyword. 

8. VARIABLE NOT PRESENT WHERE REQUIREp. Assignment and goto require variables. 
The following statement illustrates this error. 

3 = 2 

9. ENTRY POINT OF FUNCTION NOT LABEL. The entry point for a defined function 
given in DEFINE need not be a label, but must be by the time the defined function is 
first called. Otherwise this error occurs. 

10. ILLEGAL ARGUMENT TO PRIMITIVE FUNCTION. This error occurs when certain 
primitive functions are called with illegal arguments. Specific cases are in the first 
argument of ARG, FIELD, and LOCAL, the second argument of TRACE and STOPTR, and 
the third argument of OPSYN. 

11. READING ERROR. This error occurs when an error condition is encountered during an 
attempt to read a record. 

12. ILLEGAL l/O UNIT. If a negative data set reference number (unit) is given as an 
argument to BACKSPACE, ENDFILP, or REWIND, or as the second argument to INPUT or 
OUTPUT, this error occurs. 

13. LIMIT ON DEFINED DATA TYPES EXCEEDED. This limit on the number of different 
defined data types is 899. This error occurs if that limit is exceeded. 

14. NEGATIVE NUMBER IN ILLEGAL CONTEXT. Several functions require nonnegative 
integer arguments. A negative number causes this error. The specific cases are in the 
arguments to LEN, POS , RPOS , RTAB , and TAB, and in the third argument of INPUT. 

15. STRING OVERFLOW. This error occurs if a string longer than the value of 
SMAXLNGTH is formed. The initial, default value of £MAXLNGTH is 5000. 

16. OVERFLOW DURING PATTERN MATCHING. This error occurs if an internal stack 
overflows during pattern matching. Such overflow may be caused by a looping pattern, or 
excessively recursive one. 

17. ERROR IN SNOBOL4 SYSTEM. This error occurs if an internal error or inconsistency is 
detected by the SNOBOL4 system. Such an error indicates an implementation problem, 
and listings should be sent to the authors. 
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18. RETURN FROM ZERO LEVEL. This error occurs if a transfer to RETURN, PRETURN, or 
NRETURN is made outside the call of a programmer-defined function. 

19. FAILURE DURING GOTO EVALUATION. This error occurs if failure occurs in the 
process of computing a label for a transfer. An example is 

: ($(IDENT(X,Y) X Y)) 

if X and Y are not identical. 

20. INSUFFICIENT STORAGE TO CONTINUE. This error occurs when there is not enough 
storage available for continued operation of the SNOBOL4 system. 

21. STACK OVERFLOW. This error occurs if the internal stack used by the SNOBOL4 
system overflows. This error is usually the result of excessive recursion in programmer- 
defined functions. Stack overflow may also occur during storage regeneration. 

22. LIMIT ON STATEMENT EXECUTION EXCEEDED. This error occurs if the number of 
statements executed exceeds the value of 6STLIMIT. The initial default value of 
£STLIMIT is 50000. 

23. OBJECT EXCEEDS SIZE LIMIT. This error occurs if an attempt is made to create an 
object larger than the maximum possible size. On the IBM 360, this maximum size is 

bytes. 

24. UNDEFINED OR ERRONEOUS GOTO. This error occurs if an attempt is made to 
transfer to a label that does not occur in the program, or if the result of evaluating the 
goto field is not a natural variable. An example is 

:('LOOP' N) 

25. INCORRECT NUMBER OF ARGUMENTS. This error occurs if a primitive function is 
called with an incorrect number of arguments. In the case of explicit function calls, this 
error results only from too many arguments, since null strings are automatically provided 
for omitted trailing arguments. In the case of functions OPSYNed to primitive functions, or 
invoked by APPLY, this error is also caused by too few arguments. 

26. LIMIT ON COMPILATION ERRORS EXCEEDED. Up to 50 erroneous statements are 
permitted during compilation of a SNOBOL4 program. If that number is exceeded, this 
error occurs. Note that this error occurs during compilation and prevents program 
execution. 

27. ERRONEOUS END STATEMENT. This error occurs if there is a syntactic error in the 
end statement of a program, or if the label specified in the end statement does not occur 
in the program. This error prevents program execution. 

28. EXECUTION OP STATEMENT WITH A COMPILATION ERROR. This error occurs if an 
attempt is made to execute a statement in which an error was detected during 
compilation. 
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Two kinds of failure occur in pattern matching as a result of conditionally fatal errors. 
If Error 16 occurs, the entire pattern match fails, as if ABORT had been encountered. Any 
other error causes the pattern component involved to fail to match, as if FAIL had been 
encountered. 


10.3 T er mi nation 

There are three principal kinds of program termination: 

(1) normal, 

(2) error, and 

(3) cancellation. 


10.3.1 Normal Termination 


Normal termination occurs when the program transfers to END or flows into the end 
statement. Various information is printed at the end of the listing, including counts of 
certain program operations and timing information. If the value of SDUMP is nonzero at 
program termination, a listing of the values of natural variables and unprotected keywords 
is provided. Only natural variables with nonnull values are included in the listing. 

The following program illustrates the printout produced by a program that terminates 
normally. 

SNOBOL4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 


SDUMP = 1 } STRIM = 1 

♦ THIS PROGRAM IS THE ALGORITHM BY HAO WANG (CF. 'TOWARD 

♦ MECHANICAL MATHEMATICS', IBM JOURNAL OF RESEARCH AND 

♦ DEVELOPMENT 4(1) JAN 1960 PP.2-22.) FOR A PROOF-DECISION 

♦ PROCEDURE FOR THE PROPOSITIONAL CALCULUS. IT PRINTS OUT A 

♦ PROOF OR DISPROOF ACCORDING AS A GIVEN FORMULA IS A THEOREM 

♦ OR NOT. THE ALGORITHM USES SEQUENTS WHICH CONSIST OF TWO 

♦ LISTS OF FORMULAS SEPARATED BY AN ARROW (—*). INITIALLY, FOR 

♦ A GIVEN FORMULA F THE SEQUENT 

♦ 

♦ —« p 

* 

* IS FORMED. WANG HAS DEFINED RULES FOR SIMPLIFYING A FORMULA 

* IN A SEQUENT BY REMOVING THE MAIN. CONNECTIVE AND THEN 

* GENERATING A NEW SEQUENT OR SEQUENTS. THERE IS A TERMINAL 
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♦ TEST FOR A SEQUENT CONSISTING OF ONLY ATOMIC FORMULAS; 

« 

♦ A SEQUENT CONSISTING OF ONLY ATOMIC FORMULAS IS VALID IF 

* THE TWO LISTS OF FORMULAS HAVE A FORMULA IN COMMON. 

* 

* BY REPEATED APPLICATION OF THE RULES, ONE IS LED TO A SET OF 

* SEQUENTS CONSISTING OF ATOMIC FORMULAS. IF EACH ONE OF THESE 

* SEQUENTS IS VALID THEN SO IS THE ORIGINAL FORMULA. 

* 

* 

* 



UNOP = ' NOT' 



3 


BINOP = 'AND' 1 'IMP' 1 'OR' | 

'EQU' 


4 


FORMULA = ' ' UNOP . OP '(' BAL . 

PHI ' 

)’ 1 

5 

+ 

' ' BINOP . OP '(' BAL 

. PHI 

' , ' BAL . PSI ') ' 

5 

A 

ATOM = ( ' ' BAL ' ' ) . A 



6 


DEFINE('WANG(ANTE,CONSEQ)PHI,PSl') 



7 

READ 

EXP = INPUT 


;F(END) 

8 


OUTPUT = 



9 


OUTPUT = 'FORMULA; ' EXP 



10 

A 

OUTPUT = 



1 1 


WANG(,' ' EXP) 


;F(INVALID) 

12 


OUTPUT = 'VALID' 


;(READ) 

13 

INVALID 

A 

OUTPUT = 'NOT VALID' 


;(READ) 

14 

WANG 

OUTPUT = ANTE ' —♦ ' CONSEQ 



15 


ANTE FORMULA = 


;S( $('A' OP) ) 

16 


CONSEQ FORMULA = 


;S( $('C' OP) ) 

17 


ANTE = ANTE ' ' 



18 


CONSEQ = ' ' CONSEQ ' ' 



i9 

TEST 

ANTE ATOM = 


;F(FRETURN) 

20 

* 

CONSEQ A 


;S(RETURN)F(TEST) 

21 

* 

ANOT 

WANG(ANTE,CONSEQ ' ' PHI) 


;S(RETURN)F(FRETURN) 

22 

W 

AAND 

A 

WANG(ANTE ' ' PHI ' ' PSI,CONSEQ) 


; S ( RETURN ) F ( FRETURN ) 

23 

▼ 

AOR 

WANG(ANTE ' ' PHI,CONSEQ) 


;F(FRETURN) 

24 

* 

WANG(ANTE ' ' PSI,CONSEQ) 


; S ( RETURN ) F ( FRETURN ) 

25 

* 

AIMP 

WANG(ANTE ' ' PSI,CONSEQ) 


;F(FRETURN) 

26 


WANG(ANTE,CONSEQ ' ' PHI) 


; S ( RETURN ) F ( FRETURN ) 

27 


* 
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AEQU 

WANG(ANTE ' ' PHI ' ' PSI, 

CONSEQ) 

:F(FRETURN) 

28 

A 

WANG(ANTE,CONSEQ ' ' PHI ' 

' PSI) 

:S(RETURN)F(FRETURN) 

29 

CNOT 

WANG(ANTE ' ' PHI,CONSEQ) 


:S(RETURN)F(FRETURN) 

30 

CAND 

WANG(ANTE,CONSEQ ' ' PHI) 


:F(FRETURN) 

31 


WANG(ANTE,CONSEQ ' ' PSi) 


:S(RETURN)F(FRETURN) 

32 

COR 

WANG(ANTE,CONSEQ ' ' PHI ' 

' PSI) 

:S(RETURN)F(FRETURN) 

33 

CIMP 

* 

WANG(ANTE ' ' PHI,CONSEQ 

' ' PSI) 

:S(RETURN)F(FRETURN) 

34 

CEQU 

WANG(ANTE ' ' PHI,CONSEQ ' 

' PSI) 

:F(FRETURN) 

35 


WANG(ANTE ' ' PSI,CONSEQ ' 

' PHI) 

:S(RETURN)F(FRETURN) 

36 


END 37 

NO ERRORS DETECTED IN SOURCE PROGRAM 


FORMULA: IMP(NOT(OR(P,Q)),NOT(P)) 

—♦ IMP(NOT(OR(P,Q)),NOT(P)) 
NOT(OR(P,Q)) —♦ NOTCP) 

—♦ NOT(P) OR(P,Q) 

P —♦ OR(P,Q) 

P —* P Q 
VALID 

FORMULA: NOT(IMP(N0T(0R(P,Q)),NOT(P))) 

—* NOT(lMP(NOT(OR(P,Q)),NOT(P))) 
IMP(NOT(OR(P,Q)),NOT(P)) —♦ 

NOT(P) —♦ 

—♦ p 

NOT VALID 

FORMULA: IMP(AND(N0T(P),NOT(Q)),EQU(P,Q)) 

—♦ IMP(AND(NOT(P),NOT(Q)),EQU(P,Q)) 
AND(NOT(P),NOT(Q)) —♦ EQU(P,Q) 

NOT(P) NOT(Q) —♦ EQU(P,Q) 

NOT(Q) —♦ EQU(P,Q) P 
—♦ EQU(P,Q) P Q 
P —♦ P Q Q 
Q —♦ P Q P 
VALID 
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FORMULA: IMP(IMP(OR(P,Q),OR(P,R)),OR(P,IMP(Q,R))) 

—♦ IMP(IMP(OR(P,Q),OR(P,R)),OR(P,IMP(Q,R))) 
IMP(OR(P,Q),OR(P,R)) —♦ OR(P,IMP(Q,R)) 

OR(P,R) —♦ OR(P,IMP(Q,R)) 

P —♦ OR(P,IMP(Q,R)) 

P —♦ P IMP{Q,R) 

P Q —♦ P R 
R —♦ OR(P,IMP(Q,R)) 

R —♦ P IMP(Q,R) 

R Q —♦ P R 

—♦ OR(P,IMP(Q,R)) OR{P,Q) 

—♦ OR(P,Q) P IMP(Q,R) 

—♦ P IMP(Q,R) P Q 
Q —* P P Q R 
VALID 


NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 8 


DUMP OF VARIABLES AT TERMINATION 

NATURAL VARIABLES 

A = ’ Q ’ 

ABORT = PATTERN 
ARB = PATTERN 
ATOM = PATTERN 
BAL = PATTERN 
BINOP = PATTERN 

EXP = ' IMP (IMP (OR (P , Q) , OR (P , R) ) , OR {P, IMP (Q , R) ) ) ' 
PAIL = PATTERN 
FENCE = PATTERN 
FORMULA = PATTERN 

INPUT = 'IMP(IMP(OR(P,Q),OR(P,R)),OR(P,IMP(Q,R)))' 
OP = 'IMP' 

OUTPUT = 'VALID' 

REM = PATTERN 
SUCCEED = PATTERN 
UNOP = 'NOT' 
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UNPROTECTED KEYWORDS 

SABEND = 0 
6ANCHOR = 0 
6CODE = 0 
6DUMP = 1 
6ERRLIMIT = 0 
6FTRACE = 0 
6FULLSCAN = 0 
£INPUT = 1 
£MAXLNGTH = 5000 
SOUTPUT = 1 
SSTLIMIT = 50000 
£TRACE = 0 
£TRIM = 1 


SNOBOL4 STATISTICS SUMMARY- 

1731 MS. COMPILATION TIME 
349 MS. EXECUTION TIME 

163 STATEMENTS EXECUTED, 34 FAILED 

0 ARITHMETIC OPERATIONS PERFORMED 
63 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
5 READS PERFORMED 
45 WRITES PERFORMED 

2.14 MS. AVERAGE PER STATEMENT EXECUTED 


10.3.2 Error Termination 

Error termination occurs if an unconditionally fatal error occurs, or if a conditionally 
fatal error occurs when £ERRLIMIT is zero. A message identifying the error is printed, 
and then statistics and dumps are provided as for normal termination. 

The following example illustrates error termination. In this example, the input data was 
removed, causing failure in statement 3 . The resulting attempt to transfer to ERR is 
erroneous since ERR does not occur as a label in the program. 

SNOBOL4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 



£ANCHOR 

= 1 



CARDPAT 

= break!' ') . BIN LEN(1) BREAK!' 

') . NUMBER 

* 

THE FIRST 

CARD GIVES THE NUMBER OF BINS 



SUM = 

ARRAY!TRIM!INPUT),0) 

:F!ERR) 
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STRACE = 1000 4 

TRACE(.SUM<3>,'VALUE','SUM<3>') 5 

* SUBSEQENT CARDS CONTAIN A BIN NUMBER FOLLOWED BY A BLANK AND THEN 

* THE NUMBER TO BE ADDED TO THE BIN. 

READ CARD = INPUT :F(DISPLAY) 6 

CARD CARDPAT :F(ERR) 7 

SUM<BIN> = SUM<BIN> + NUMBER :S(READ)F(ERR) 8 

* priHt out the sums 


DISPLAY 9 

1 = 1 1C 

PRINT OUTPUT = 'SUM<' I ’> = ' SUM<I> :F(END) 11 

I = 1+1 ;(PRINT) 15 

END 13 


NO ERRORS DETECTED IN SOURCE PROGRAM 


ERROR 24 IN STATEMENT 3 AT LEVEL 0 
UNDEFINED OR ERRONEOUS GOTO 


SNOBOL4 STATISTICS SUMMARY- 

1148 MS. COMPILATION TIME 
50 MS. EXECUTION TIME 
3 STATEMENTS EXECUTED, 1 FAILED 

0 ARITHMETIC OPERATIONS PERFORMED 
0 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
1 READS PERFORMED 
0 WRITES PERFORMED 

16.67 MS. AVERAGE PER STATEMENT EXECUTED 

Another form of error termination occurs when an error is detected by the FORTRAN 
I/O routines. In this case, an error message is printed by the I/O routines, followed by 
the message CUT BY SYSTEM. 

This type of error is illustrated by the following program in which no failure exit is 
provided on the statement reading data. Eventually data is exhausted, and a subsequent 
attempt to read causes the I/O routines to attempt to read from the second file for data 
set reference number 5. This file does not exist, and on the IBM 360 the error message 
IHC219I is printed. 
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SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 



STRIM 

= 1 


1 


FINDNO 

= BREAK(♦NO) . NO 


2 


ONE = 

LEN(1) . NO 


3 


LO 

POS(O) SPAN('O') 


4 

* 

DEFINE('DEHEX(STR)NO') 

:(DEHEX.END) 

5 

* 

DEHEX 

STR LO 



6 

DEHEX1 

STR ONE = 

;F(RETURN) 

7 


DEHEX = 

INTEGER (NO) 16 ♦ DEHEX -I- NO 

:S(DEHEX1) 

8 


'ABCDEF' 

FINDNO 

:F(FRETURN) 

9 


DEHEX = 

16 ♦ DEHEX -1- 10 SIZE(NO) 

:(DEHEX1) 

10 

DEHEX. 

* 

END 



11 

* 

READ 

NUMBER 

= INPUT 


12 


OUTPUT 



13 


OUTPUT 

= 'DEHEX(' NUMBER ') = ' DEHEX(NUMBER) 

13 

+ 



:S(READ) 

13 


OUTPUT 

= 'UNABLE TO CONVERT ' NUMBER 

: (READ) 

14 

END 




15 


NO ERRORS DETECTED IN SOURCE PROGRAM 


DEHEX(100) = 256 
DEHEX(OOOII) = 17 
DEHEX(OOOF) = 15 
UNABLE TO CONVERT 1ABCG 
DEHEX(IABC) = 6844 
DEHEX(OOOPACE) = 64206 
DEHEX(OOOFACE) = 64206 


IHC219I 


CUT BY SYSTEM IN STATEMENT 12 AT LEVEL 0 
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SNOBOL4 STATISTICS SUMMARY- 

1048 MS. COMPILATION TIME 
216 MS. EXECUTION TIME 

111 STATEMENTS EXECUTED, 28 FAILED 

59 ARITHMETIC OPERATIONS PERFORMED 
52 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
8 READS PERFORMED 
7 WRITES PERFORMED 

1.95 MS. AVERAGE PER STATEMENT EXECUTED 


The message CUT BY SYSTEM may also occur if an error in the SNOBOL4 system 
causes a program interrupt. 


10.3.3 Cancellation Termination 

On the IBM 360, job cancellation prevents the SNOBOL4 system from regaining 
control. A system completion code is given, but no further output is printed. Statistics and 
dumps are lost, and it is not possible, in general, to determine where the program was 
when it was cancelled. In fact, program output may not be completely printed because of 
buffered output. 

A typical cause for cancellation is a program loop, causing execution to exceed specified 
limits. Since some printed output may be lost, the program listing may be misleading. It is 
possible, for example, for a program to be cancelled in an execution loop, even though the 
compilation listing is incomplete, giving the erroneous impression of cancellation during 
compilation. 




CHAPTER 11 


Programming Details and Storage Management 


This chapter is concerned with efficient programming techniques and good utilization of 
storage. The material that follows is intended for experienced programmers and 
individuals who have difficulties in running SNOBOL4 programs that are very large and 
time-consuming. 

The most efficient programming and best use of the SNOBOL4 language requires 
considerable experience and sophistication. A student just learning the language should not 
attach too much importance to the material in this chapter. The various interdictions and 
comparisons will only obscure the underlying structure of SNOBOL4. 


11.1 Implementation Overview 

The key to efficient use of SNOBOL4 is an understanding of its implementation. The 
implementation is in itself a very large and involved topic. Only the most superficial 
description is attempted here. 

The SNOBOL4 system consists of three main parts: A compiler, an interpreter, and a 
storage allocator. 

The compiler analyzes the source program and converts it to a form of object code. In 
the analysis, each language element is converted into a suitable internal representation 
consisting of numbers, pointers, and flags. The object code is put in a Polish prefix form 
suitable for interpretation. 

During the execution phase, the compiled object code is interpreted; functions are 
called, arguments fetched, and the resulting language operations are performed. Because of 
the nature of SNOBOL4, procedures are implemented in a recursive way. 

Storage allocation occurs frequently during both compilation and interpretation. Strings 
identified by the compiler are entered into a storage area. During execution, storage is also 
allocated for new strings, patterns, arrays, and so forth. SNOBOL4 is unusual in this 
sense. In most languages, data objects are identified by declarations handled during 
compilation and hence little, if any, allocation is required during execution. Allocation of 
new data objects typically continues throughout execution of a SNOBOL4 program. As a 
result of continued allocation, available storage may be Exhausted. Storage regeneration. 
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which discards old, unneeded objects to make more space available, is therefore an 
important function of the storage allocator. 


11.2 Strings 

Handling strings is fundamental to the operation of the language. SNOBOL4 supports 
operations on strings of essentially unlimited length, without requiring any attention on 
the part of the programmer. This is particularly important in string formation where 
strings of any length can be assigned to any variable. 

More important to the internal operation of the language is the fact that any nonnull 
string can be used as a variable. Some natural variables are explicit in the source 
program. Indirect referencing permits any string to be used as a variable, and hence have 
an assigned value. The statement 

$INPUT = X 

is an extreme example in which a string is read in and used as a variable. 

To support the use of any nonnull string as a variable, every newly formed string is 
created as a variable. Space is provided for the value of that variable, and the null string 
is provided as the initial value. Proper functioning of the language requires that those 
natural variables be unique. Otherwise two different copies of the variable might have 
different values. In SNOBOL4 these problems are handled by assuring that each string 
occurs exactly once in storage. When a new string is formed, a search must be performed 
to determine whether that string already exists. Since string formation is a frequent 
operation, special techniques are used to make the search efficient. A combination of hash 
addressing [10,11] and a linked-bin structure is used. Each string, once it is formed, is 
uniquely represented by a pointer (address). The value of a natural variable is obtained 
by referencing the location to which it points. In the statement 

X = Y Z 

the variables X , Y , and Z are represented by pointers. Values of Y and Z are easily 
obtained. Assuming these values are strings, concatenation is performed, resulting in a new 
string. A search for this string is then performed. If it already exists as a variable, a 
pointer to it is returned. If the string is new, space is allocated for it, and a pointer to the 
new variable is returned. Assignment of this pointer as the value of X is now simple. This 
example illustrates a general characteristic of strings. Obtaining and assigning values is a 
simple operation. Creation of a new string is relatively more time consuming. 

From the discussion, it should be evident that indirect referencing is a relatively efficient 
op>eration. Since natural variables are represented by pointers, an indirect reference only 
requires going from a pointer to the value to which it points. 



11.4 


Patterns and Pattern Matching 193 


11.3 Other Variables 

Strings are only one kind of variable. Others are array and table references, fields of 
programmer-defined data objects, and keywords. 

To reference an array, the variable is computed from the given indices. This operation 
is relatively simple and efficient. A similar operation is required for fields. 

Table references are more complicated, and require a search of the table for the 
referencing item. This search is linear, but only requires comparison of pointer-type 
quantities. No string comparison is required since strings are represented by pointers. The 
time required to reference a table depends on the size of the table and the location of the 
referencing item within the table. 

A typical alternative to tables is indirect referencing. The statements 
T<N> = X 

and 

$('L' N) = X 

are representative of the two choices. Generally speaking, the two take similar amounts of 
time. The time spent searching the table T is similar to the time taken to perform the 
concatenation and locate the corresponding natural variable. The space required for many 
natural variables with the prefix L is much greater than the space required for the table 
T, however. 

On the other hand, if no prefix is required, 

$N = X 

is more efficient than the table reference. Of course tables can be referenced with objects 
of any data type, while indirect referencing is limited to strings. Another consideration is 
whether a list of referenced items must be kept if indirect referencing is used. Conversion 
from TABLE to ARRAY provides such a list in a simple, efficient manner if tables are used. 


Keyword references also require a linear search through a small table. Since keyword 
references are infrequent and the search small, keywords can be considered on a par with 
natural variables. 


11.4 Patterns and Pattern Matching 

A pattern is a data object, and may be very large if the pattern is complicated. The 
construction of a pattern is an involved process. For these reasons, it is more efficient to 
construct patterns outside of pattern matching statements, rather than to construct them 
repetitiously with each execution of the pattern matching statements. Unevaluated 
expressions make it possible to avoid construction of patterns in pattern matching 
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statements, even if components of the pattern vary. An unevaluated expression amounts to 
a pointer to another pattern, which is accessed during pattern matching when the 
unevaluated expression is encountered. 

Chapter 2 describes pattern matching in some detail, and much information about 
efficiency can be obtained by considering the bead diagrams and operation of the scanner. 
Several simple rules are worth remembering: 

(1) Use the anchored mode if possible. 

(2) Use BREAK instead of ARB if possible. 

(3) Use ANY instead of alternation if possible. 

(4) Avoid use of ARBNO. 

(5) Use conditional rather than immediate value assignment if possible. 

(6) Use the quickscan mode if possible. 

Many programs can execute entirely in the anchored mode, avoiding much useless, 
time-consuming searching during pattern matching. In the unanchored mode, individual 
patterns can be anchored by using FENCE or POS ( 0 ) as their first component. 

Where a known delimiter is sought, BREAK should be used. ARB , as its bead diagram 
illustrates, matches each character individually, which may be very inefficient. 

Where one of a number of single characters is sought, ANY should be used. Alternation 
requires a large pattern with many components that must be matched one at a time. The 
time required for ANY to match is independent of the order in which the characters in its 
argument appear. In alternation, as indicated by the bead diagrams, the time depends 
strongly on the order of the alternatives. 

ARBNO should be used only where its function is really needed. SPAN often will do, and 
is much more efficient. 

Immediate value assignment often performs several assignments in the course of pattern 
matching. These additional operations should be avoided unless the dynamic effects of 
immediate assignment are really needed. 

As illustrated in Chapter 2, the quickscan mode is designed to improve the efficiency of 
pattern matching by avoiding alternatives that cannot lead to successful matching. The 
fullscan mode, on the other hand, permits attempts to match, however futile they may be. 
For this reason, the fullscan mode may cause pattern matching to take much longer than 
the quickscan mode. The fullscan mode should be used only when pattern matching 
involves dynamic effects resulting from unevaluated expressions or immediate value 
assignment. If fullscan is needed only in a few statements, it is worth changing modes for 
these statements rather than running the entire program in the fullscan mode. 
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11.5 Input and Output 

Automatic input and output occurs because of variables associated with data sets. If 
6OUTPUT is nonzero, output association is checked by a table lookup for every assignment 
of value to every variable. The time required to determine whether a particular variable is 
output-associated depends on the pumber of output-associated variables. This checking can 
be bypassed entirely by setting the value of 6 OUTPUT to zero during periods of program 
execution where output is not required. Input is handled in a similar manner, although 
checking is usually more frequent because it is required for every use of every variable. 
£ INPUT can also be set to zero if input is not required. 

Trimming of trailing blanks by setting 6TRIM to a nonzero value has the benefit of 
creating shorter strings on input. Space is not allocated for the trailing blanks. Automatic 
trimming on input should be used, where appropriate, rather than explicit application of 
the TRIM function. TRIM applied to INPUT results in two strings: one consisting of the 
full input line, and another consisting of the trimmed result. 


11.6 Storage Management 

Dynamic storage is continually allocated during program compilation and execution. All 
forms of programmer data reside in allocated storage and compete for available space. 
This includes object code, strings, patterns, arrays, and so forth. Some data, depending on 
its use, is transient and may be discarded. Other data is always accessible to program and 
must be kept. 

When available space is exhausted, the storage allocator regenerates storage, collecting 
all needed data and discarding all data inaccessible to the program. This process occurs 
automatically and usually does not concern the programmer directly. Statistics indicating 
a large number of regenerations suggest potential trouble. Repetitious construction of 
patterns and manipulation of very long strings are the most common causes of frequent 
storage regeneration. The amount of time required to regenerate storage depends on many 
factors and no specific figures can be given. A rough rule of thumb is that storage 
regeneration takes about 20 to 100 times as long as the execution of a typical statement. 
Thus storage regeneration is not a factor in program efficiency unless it occurs frequently. 


11.6.1 Forcing Storage Regeneration 

In special circumstances, a programmer may want to force storage regeneration or find 
out how much storage is available. This is done with the function COLLECT , which forces 
storage regeneration. COLLECT returns as value the amount of storage available (in bytes 
on the IBM 360) after regeneration. COLLECT (N) fails if less than N bytes remain. 
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11.6.2 Clearing Variable Values 

Some programs are organized to process several sections of data independently, making 
it possible or necessary to remove residual data between sections. The function CLEAR 
assists in this manner. A call of CLEAR ( ) sets the values of all natural variables to the 
null string. CLEAR does not affect the values of keywords, created variables, I/O 
associations, or function definitions. Furthermore, values are cleared only at the level at 
which CLEAR is called. If values are saved by a call to a programmer-defined function, 
these values are restored when the function returns. This fact may be used to save the 
values of selected variables by calling CLEAR in a function in which these variables are 
formal arguments. If the values of X, Y, Z, and PAT are to be saved, a function RESET 
could have the defining statement 

define!'RESET(X,Y,Z,PAT)') 
with the procedure 

RESET clear!) :!RETURN) 

The values ofX, Y, Z, and PAT (and RESET) are saved when RESET is called, and 
restored when it returns. All other natural variables are cleared. 

CLEAR applies to the natural variables ABORT, ARB, PAIL, PENCE, REM, and SUCCEED, 
and clears their primitive pattern values. The corresponding keywords SABORT, £ARB, 
SFAIL, SPENCE, 6REM, and SSUCCEED, which are not affected by CLEAR, can be used 
to restore primitive patterns. 
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Syntax of SNOBOL4 


This formal description of the syntax of SNOBOL4 is given in a syntax notation used 
in many IBM manuals [10]. Rules explaining this notation follow. 

(1) A class of elements is denoted by a notation variable, which consists of lower 
case letters and periods and must begin with a letter. 

(2) Literal characters are denoted by capital letters or special characters. Lower case 
letters and syntactic symbols are underlined when they represent literals. A lone 
underscore stands for itself. 

(3) A syntactic unit is defined as one of the following: 

a. a notation variable, 

b. literal characters, or 

c. any collection of variables, literals, and syntax notation surrounded by braces 
or brackets. 

(4) Braces 1 1 denote a grouping. 

(5) Square brackets I 1 denote an option. Anything enclosed within brackets may 
appear or be omitted. 

(6) Vertical stacking of syntactic units and the vertical stroke | denote alternatives. 

(7) Three dots . . . denote optional repetition of the immediately preceding syntactic 
unit one or more times. 

(8) Footnotes are used where restrictions apply to notation variables. 
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A,1 Syntax of Statements 

The following notation variables define the components of a statement, leading to the 
definition of a statement itself. 

digit: 0|1|2|3|4|5|6 | 7|8|9 

letter: a|b|c|d|b|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z 

alphanumeric: letter{digit 

identifier: letter [alphanumeric . 

blanks: one or more blank characters 

integer: digit (digitl... 

real: integer . 1 integer 1 

operator: -'|?|$|^|l|3*|*|/|#| + |- |a|_L|6 

unary: operator 

binary: blanks [(operator!**] blanks! 

string: zero or more EBCDIC characters 

sliteral: '(string)'' 

dliteral: "(string}'*" 

literal: sliteralj dliteraljinteger j real 

element: (unary!... /^identifier 

literal 

( function.call S 
reference 
expression 

operation: element binary (element|expression) 

expression: (blanks! (elementjoperation] (blanks! 
arg.list: expression (, expression!... 

function.call: identifier ( arg.list ) 
reference: identifier < arg.list > 

label: (alphanumeric (string)M" 


subject.field: blanks element 
pattern.field: blanks expression 
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object.field: blanks expression 

equal: blanks = 

goto: {( expression )|< expression >) 

goto.field: blanks j_ [blanks] ^goto 

S goto [blanks] [F goto] 

V.F goto [blanks] [S goto] 

eol: end of line 

eos: [blanks] (;|eol1 

assign.statement: [label] subject.field equal [object.field] 

[goto.field] eos 

match.statement: [label] subject.field pattern.field [goto.field] eos 

repl.statement: [label] subject.field pattern.field equal 

[object.field][goto.field] eos 

degen.statement: [label] [subject.field] [goto.field] eos 

end.statement: END [blanks [label|END]] eos 

statement: assign.statement|match.statement|repl.statement| 

degen.statement|end.statement 

The following notes apply to the syntax given above: 

’ not including a single quote ' 

^ not including a double quote " 

^ not including a blank or semicolon 

“ but not END. ( RETURN, FRETURN, and NRETURN are syntactically acceptable, but 
do not override the standard function returns.) 
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A SNOBOL4 program consists of a sequence of statements terminating with an end 
statement. Interspersed among these statements may be comment lines and control lines. 


comment.line: 
control.line: 


♦ string eol 

- [blanks] TLIST blanks [ LEFT 1 RIGHT] 
< UNLIST 
LEJECT 


[blanks] eol' 
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A statement begins immediately following the preceding statement, i.e. at the beginning 
of a line or following a semicolon. A statement may be continued on the next line by 
using a continue line. 

continue.line: 1+1^1 remainder of statement 

Comment, control, and continue lines must begin at the beginning of a line. They may 
not start in the interior of a line following a semicolon. A statement may be broken over a 
line boundary anywhere a blank is optional in the syntax. If a statement has the form 

parti [blanks! part2 

it may be continued as 

parti [blanks! eol 
{ + [blanks] part2 

where the + or . begins a new line as indicated, and takes the place of the optional 
blank. 


A3 Syntax of Prototypes 

Prototypes for arrays, programmer-defined functions, programmer-defined data types, and 
external functions are evaluated during program execution, not during compilation. These 
prototypes may be given explicitly as literals or may be computed in a variety of ways. 
When ARRAY, DEFINE, DATA, or LOAD is executed, the corresponding prototype is then 
analyzed. The syntax of these prototypes follows. 

item: [identifier! 

item.list: item 1, item!... 

data.prototype; identifier ( item.list ) 

function.prototype; identifier ( item.list ) item.list 

external.prototype; identifier ( item.list ) item 

signed.integer: [[+|-] integer! 

dimension: signed.integer [j_ signed.integer 1 

array.prototype: dimension [, dimension]... 
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A.4 Syntax of Data Type Conversions 

Conversion from STRING to another data type involves a syntactic analysis of the string 
to be converted. Explicit conversions are performed to INTEGER, REAL, EXPRESSION, 
and CODE . Implicit conversions are performed for INTEGER and REAL. 

string.integer: signed.integer 
string.real: 1 (-l-1 - 1 real 1 

string.expression: expression 

string.code: statement 1; statement!... 


A.5 Character Codes for Various Machines 

The character codes and graphics for symbols used in SNOBOL4 vary from machine to 
machine. The graphics used in this book correspond to the IBM 360 implementation. The 
following table summarizes the card codes and graphics for other implementations, using 
the IBM 360 as the point of reference. Not included in this table are the RCA Spectra 70 
and XDS Sigma 5/6/7 which have the same codes and graphics as the IBM 360. The 
PDF-10 characters are similar to the IBM 360 as noted below. Punched card codes and 
typical printer graphics are given. Internal codes vary, and communication terminals may 
have different graphics. 

IBM 360 CDC 6600 UNIVAC 1108 GE 635 


9 

o 

1 

U) 

1 

oo 

9 

0-3-8 

9 

0-3-8 

9 

0-3-8 

• 

• 

2-8 

• 

• 

2-8 

• 

• 

5-8 

• 

• 

5-8 

• 

9 

6-8-11 

• 

9 

7-8-12 

• 

9 

6-8-11 

• 

9 

6-8-11 

If 

7-8 


4-8 


7-8-11 

ft 

0-6-8 

1 

5-8 

t 

5-8-11 

f 

4-8 

1 

7-8-11 

( 

5-8-12 

( 

0-4-8 

( 

0-4-8 

( 

5-8-12 

) 

5-8-11 

) 

4-8-12 

) 

4-8-12 

) 

5-8-11 

< 

4-8-12 

[ 

7-8 

1 

5-8-12 

[ 

2-8 

> 

0-6-8 

1 

0-2-8 

1 

5-8-11 

1 

4-8-12 

= 

6-8 

= 

3-8 

= 

3-8 

= 

0-5-8 

—I 

7-8-11 


6-8-12 

\ 

0-6-8 

\ 

7-8-12 

• 

0-7-8 


0-5-8 

p 

0-12 

p 

7-8 

$ 

3-8-11 

$ 

3-8-11 

$ 

3-8-11 

$ 

3-8-11 
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IBM 360 

3-8-12 

• 

CDC 6600 

3-8-12 

UNIVAC 1108 

3-8-12 

• 

GE 635 

3-8-12 

1 

• 

% 

2-8-11 

0-4-8 

> 

none 

5-8-12 

% 

none 

0-5-8 

% 

none 

0-4-8 

* 

4-8-11 

* 

4-8-11 

* 

4-8-11 

* 

4-8-11 

/ 

0-1 

/ 

0-1 

/ 

0-1 

/ 

0-1 

# 

3-8 

< 

5-8 

# 

7-8-12 

# 

3-8 

+ 

6-8-12 

+ 

12 

+ 

12 

+ 

0-12 

- 

11 

- 

1 1 

- 

1 1 

- 

11 

a 

4-8 

i 

6-8-11 

a 

7-8 

a 

4-8 

1 

7-8-12 

V 

0-11 

! 

0-11 

1 

• 

0-7-8 

£ 

12 

A 

0-7-8 

£ 

2-8 

£ 

12 

Notes: 

1. The 

PDF-10 has the 

same character set as 

the IBM 

360, except ! 

is equivalent to | 


and \ (card code 7-8) is equivalent to -i. 

2. The CDC 3600 and the rest of the CDC 6000 series have the same character set as the 
CDC 6600. Card code 2-8-11 can also be used for V . 


A.6 Extended Syntax for the IBM 360 Implementation 

On the IBM 360 implementation, alternatives for several syntactically significant 
characters are provided. The extended syntax permits the use of characters available from 
terminal input and for which there are graphics on some print trains. The following table 
illustrates the added characters available in addition to the standard ones. Lower-case 
letters are also available on the IBM 360 implementation. 


Added 

Hexadecimal Code 

Language Use 

Standard 

tab 

05 

concatenation, separator 

blank 

1 

AD 

left bracket and 

left direct goto delimiter 

< 

1 

BD 

right bracket and 
right direct goto delimiter 

> 


4A 

assignment 

= 

t 

9F 

exponentiation 

1 

r 

AC 

comment line 

* 

1 

4F 

comment line 

* 

L 

AB 

comment line 

* 
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Notes: 

1. The tab character cannot be used in OPSYN to represent concatenation, and is not 
treated like a blank in trimming. 

2. The PDP-10 implementation permits the use of I , 1 , tab, •^, and t in the same way as 
the extended IBM syntax. 




APPENDIX B 


Versions 2 and 3 of SNOBOL4 


The first edition of this book describes Version 2 of SNOBOL4. This edition describes 
Version 3, which became available in December of 1969. Differences between the two 
versions are significant, and are detailed in the following sections. 


B. 1 Running Version 2 Programs under Version 3 

Since Version 3 supercedes Version 2, the most serious differences are those that require 
changes in Version 2 programs to run them under Version 3. These differences are minor, 
and almost all programs written to run under Version 2 will run properly under Version 
3 without modification. Points of difference follow. 

1. In Version 2, negative values of &FTRACE and £ TRACE turn on the tracing modes. In 
Version 3, the tracing modes are only on for positive values of these keywords. 

2. An improvement in pattern matching heuristics for patterns involving LEN may be 
detectable during pattern matching in the quickscan mode. Fewer alternatives are 
attempted in Version 3. 

3. Data type checking in the pattern-valued functions causes detection of erroneous 
arguments when the functions are called. In Version 2 such errors are not detected until 
pattern matching takes place. 

4. The CONVERT function now performs conversions to STRING and INTEGER that fail in 
Version 2. 

5. The DATATYPE function does not fail on external data types in Version 3 as it does in 
Version 2. 

6. Negative arguments to COLLECT are erroneous in Version 3, but not in Version 2. 

7. The standard print line length in Version 3 is 132, as opposed to 131 in Version 2. 

8. Several constructions that are erroneous in Version 2 are acceptable in Version 3. 
Mixed-mode.arithmetic is typical. 

9. Program listing formats and messages are somewhat different in the two versions. 
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10. Program timing typically varies between the two versions. 

11. On the IBM 360, handling of program interrupts, FORTRAN I/O error conditions, 
and step termination differs in the two versions. 


B.2 Running Version 3 Programs under Version 2 

Version 3 contains a number of features not available in Version 2. For this reason, many 
Version 3 programs will not run under Version 2 without considerable modification. 
Principal differences follow. 

1. Version 2 does not have tables, or any of the features relating to them. 

2. Version 2 does not have error control. All errors are fatal. 

3. Version 2 does not permit mixed-mode arithmetic, comparison of real numbers, 
automatic conversion to and from REAL, or REAL -to-INTEGER conversion. 

4. Version 2 does not support exponentiation of reals. 

5. Version 2 does not permit operator definition byOPSYN. The unary operators ! and | 
are lacking, as well as the binary operators -i , ? , $, and I . 

6. A string to be converted to CODE must end with a semicolon. 

7. The following functions are not included in Version 2: DUMP, DUPL, REMDR, and 
TABLE. 

8. The following keywords are not included in Version 2: 8CODE, SERRLIMIT, 
6ERRTYPE,6INPUT, &0UTPUT, and £TRIM. 

There are many other minor differences, mainly relating to data type conversions. 
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Sample Programs 


Sample Program 1 


SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 


« 

* THIS PROGRAM USES ARRAYS, PROGRAMMER-DEFINED FUNCTIONS, AND 

♦ A VARIETY OF OUTPUT FORMATS TO PRODUCE SETS OF BRIDGE HANDS. 

♦ THE FUNCTION DEALO USES A PSEUDO-RANDOM NUMBER GENERATOR (13l 

* TO DEAL CARDS INTO FOUR ARRAYS NORTH, EAST, SOUTH, AND WEST. 

♦ THE FUNCTION DISPLAY() PRINTS THE HANDS, ONE TO A PAGE. 

« 

OUTPUT('TITLE',6,'(14H1THIS IS HAND ,110A1)') 

OUTPUT('DEALER',6,'(11H DEALER IS ,110A1)') 

OUTPUT('SKIP',6,'(A1)') 

* 

♦ FUNCTIONS 

* 

DEFINE('DEAL()') 

DEFINE('DISPLAY()') 

DEFINE('LINE(STR1,COL 1,STR2,COL2)BL1,BL2') 

DEFINE('RANDOM(N)') 

DEFINE('SUITL(HAND,SUIT)N') :(CONSTANT) 

* 

DEAL DEALSEQ DEALHAND 

DECK = COPY(NEWDECK) 

N = 51 

NLOOP DEALSEQ NXTHAND 

CARD = RANDOM(N + 1) 

DECK<CARD> SUITRANK 
ITEM ( $HAND, SUIT, RANK) 

N = GT(N,0) N - 1 

* 

DISPLAY TITLE = NTHDEAL 


= NE(CARD,N) DECK<N> 

= $RANK 

:S(NLOOP)F(RETURN) 


1 

2 

3 


4 

5 

6 

7 

8 

9 

10 
11 
12 

13 

14 

15 

16 

17 
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DEALER 

= 

DEALR 

18 


SKIP = 

1 

1 

19 


OUTPUT 

= 

LINE('NORTH',40) 

20 


OUTPUT 

= 


21 


OUTPUT 

= 

LINE(SUITL(NORTH,'S'),40) 

22 


OUTPUT 

= 

LINE(SUITL(NORTH,'H'),40) 

23 


OUTPUT 

= 

LINE(SUITL(NORTH,'D'),40) 

24 


OUTPUT 

= 

LINE(SUITL(NORTH,'C),40) 

25 


SKIP = 

1 

1 

26 


OUTPUT 

= 

LINE('WEST',20,'EAST',60) 

27 


OUTPUT 

= 


28 


OUTPUT 

= 

LINE(SUITL(WEST,'S'),20, 

29 

+ 



SUITL(EAST,'S'),60) 

29 


OUTPUT 

= 

LINE(SUITL(WEST,'H'),20, 

30 

+ 



SUITL(EAST,'H'),60) 

30 


OUTPUT 

= 

LINE(SUITL(WEST,'D'),20, 

31 

+ 



SUITL(EAST,'D'),60) 

31 


OUTPUT 

= 

LINE(SUITL(WEST,'C),20, 

32 

+ 



SUITL(EAST,'C),60) 

32 


SKIP = 

1 

1 

33 


OUTPUT 

= 

LINE('SOUTH',40) 

34 


OUTPUT 

= 


35 


OUTPUT 

= 

LINE(SUITL(SOUTH,'S'),40) 

36 


OUTPUT 

= 

LINE(SUITL(SOUTH,'H'),40) 

37 


OUTPUT 

= 

LINE(SUITL(SOUTH,'D'),40) 

38 


OUTPUT 

= 

LINE(SUITL(SOUTH,'C),40) 

39 

+ 

* 



;(RETURN) 

39 

* 

LINE 

BL LEN 

(COL1 - 1) . BL1 

40 


BL2 = 

DIFFER(STR2) DUPLC ',C0L2 - (COLl + SIZE(STRI))) 

41 

LINE1 

* 

LINE = 

DUPLC ',C0L1 - 1) STR1 BL2 STR2 :(RETURN) 

42 

* 

RANDOM 

RAN.VAR 

- 

= RAN.VAR ♦ 1061 + 3251 

43 


RAN.VAR 

RTAB(5) = 

44 

* 

RANDOM 

— 

(RAN.VAR ♦ N) / 100000 :(RETURN) 

45 

* 

SUITL 

SUITL 

— 

SUIT ' 

46 

SUITL1 

SUITL 

= 

SUITL HAND<$SUIT,N> ;F(RETURN) 

47 


N = N 

+ 

1 ;(SUITL1) 

48 

♦ 

♦ 

A 



CONSTANTS 


CONSTANT 

S = 1 



49 


H = 2 



50 


D = 3 



51 
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C = 4 


52 


$0 = 'A' 


53 


$1 = 'K' 


54 


$2 = 'Q' 


55 


$3 = 'J' 


56 


$4 = '10' 


57 


$5 = '9' 


58 


00 

II 

u> 


59 


II 


60 


$8 = '6' 


61 


$9 = '5' 


62 


$10 = '4' 


63 


$11 = '3' 


64 


$12 = '2' 


65 


DEALSEQ = 'NORTH,EAST,SOUTH,WEST,NORTH,' 


66 


NXTHAND = ♦HAND ',' BREAK(',') . HAND 


67 


DEALHAND = ♦DEALR ',' BREAK(',') . HAND 

. DEALR 

68 


SUITRANK = LEN(1) . SUIT REM . RANK 


69 


EMPTYHAND = ARRAY('4,0:12') 


70 


NEWDECK = ARRAY('0:51') 


71 


RAN.VAR = 157 


72 


DEALMAX = 3 


73 


NTHDEAL = 


74 


DEALR = 'WEST' 


75 


N = 0 


76 

BLDDEK 

1 = 1+1 


77 


R = 0 


78 

BLDDK1 

NEWDECK<N> = I R 


79 


N = LT(N,51) N + 1 

:F(NEWDEAL) 

80 

* 

R = LT(R,12) R+1 

:S(BLDDK1)F(BLDDEK) 

81 

* 

NEWDEAL 

NTHDEAL = LT(NTHDEAL,DEALMAX) NTHDEAL + 1 

1 :F(END) 

82 


NORTH = COPY(EMPTYHAND) 


83 


EAST = COPY(EMPTYHAND) 


84 


SOUTH = COPY(EMPTYHAND) 


85 

A 

WEST = COPY(EMPTYHAND) 


86 


DEAL() 


87 


DISPLAY() 

:(NEWDEAL) 

88 

END 



89 


NO ERRORS DETECTED IN SOURCE PROGRAM 
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THIS IS HAND 1 
DEALER IS NORTH 


NORTH 

S AQ974 
H K5 
D AJ93 
C Q8 


WEST 


EAST 


S J83 
H A108 
D 872 
C K1042 


S 62 
H Q9732 
D K64 
C J96 


SOUTH 

S K105 
H J64 
D Q105 
C A753 
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THIS IS HAND 2 
DEALER IS EAST 


NORTH 
S J 

H AKQJ95 
D J107 
C A83 


WEST 

S Q852 
H 10732 
D 83 
C J75 


EAST 

S AK974 
H 86 
D AK42 
C 96 


SOUTH 

S 1063 
H 4 
D Q965 
C KQ1042 




THIS IS HAND 3 
DEALER IS SOUTH 
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NORTH 

S J107 
H Q 
D A985 
C K7653 


WEST EAST 


S 954 
H 864 
D K7643 
C J10 


S AQ863 
H J10952 
D 10 
C AQ 


SOUTH 

S K2 
H AK73 
D QJ2 
C 9842 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 82 


SNOBOL4 STATISTICS SUMMARY- 

2279 MS. COMPILATION TIME 
3295 MS. EXECUTION TIME 

3026 STATEMENTS EXECUTED, 142 FAILED 

1783 ARITHMETIC OPERATIONS PERFORMED 
516 PATTERN MATCHES PERFORMED 

0 REGENERATIONS OF DYNAMIC STORAGE 
0 READS PERFORMED 
69 WRITES PERFORMED 

1.09 MS. AVERAGE PER STATEMENT EXECUTED 
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Sample Program 2 


SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 




« 

« 

« 

* 

* 

* 

* 


THIS PROGRAM COMPUTES AND PRINTS A TABLE OF N FACTORIAL 
FOR VALUES OF N FROM 1 THROUGH AN UPPER LIMIT "NX". 


* 
* 
* 
« 

IT DEMONSTRATES A METHOD OF MANIPULATING NUMBERS WHICH ARE ♦ 
TOO LARGE FOR THE COMPUTER, AS STRINGS OF CHARACTERS. THE ♦ 
COMMAS IN THE PRINTED VALUES ARE OPTIONAL, ADDED FOR READING ♦ 


* 

EASE. 

* 


* 


* 


* * 

«**«*««**««*««**«*****«** 

*«*«***«« 


w 

« 

A 

INITIALIZATION. 



Ik 

NX = 35 


1 


N = 1 ; NSET = 1 


2 


NUM = ARRAY(1000) 


4 


NUM<1> = 1 


5 


FILL = ARRAY('0:3') 


6 

A 

FILL<0> = '000'; FILL<1> = '00'; FILL<2> 

= 'O' 

7 


OUTPUT = ' TABLE OF FACTORIALS FOR 1 

THROUGH ' NX 

10 

A 

OUTPUT = 


11 

* 

* 

COMPUTE THE NEXT VALUE FROM THE PREVIOUS ONE. 



LI 

I = 1 


12 

L2 

NUM<I> = NUM<I> ♦ N 

:F(ERR) 

13 


I = LT(I,NSET) I + 1 

:S(L2) 

14 


I = 1 


15 

L3 

LT(NUM<I>,1000) 

:S(L4) 

16 


NUMX = NUM<I> / 1000 

:F(ERR) 

17 


NUM<I + 1> = NUM<I + 1> + NUMX 

:F(ERR) 

18 


NUM<I> = NUM<I> - 1000 ♦ NUMX 

:F(ERR) 

19 

L4 

A 

I = LT(I,NSET) I + 1 

:S(L3) 

20 

* 

* 

FORM A STRING REPRESENTING THE FACTORIAL. 



♦ 

L5 

NSET = DIFFER(NUM<NSET + 1>) NSET + 1 


21 


NUMBER = NUM<NSET> 

:F(ERR) 

22 


I = GT(NSET,1) NSET - 1 

:F(L7) 

23 
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L6 NUMBER = NUMBER FILL<SIZE(NUM<I>)> NUM<I> 2^ 

I = GT(I,1) I - 1 ;S(L6) 25 

* 

♦ OUTPUT A LINE OF THE TABLE. 

* 

L7 OUTPUT = N '1=' NUMBER 26 

N = LT(N,NX) N + 1 :S(L1)F(END) 27 

* 

* ERROR TERMINATION. 

* 

ERR OUTPUT = N '! CANNOT BE COMPUTED BECAUSE OF TABLE OVERFLOW.' 28 

OUTPUT = ' INCREASE THE SIZE OF ARRAY "NUM".' 29 

* 

END 30 


NO ERRORS DETECTED IN SOURCE PROGRAM 
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TABLE OF FACTORIALS FOR 1 THROUGH 35 


1 ! = 1 
2!=2 
3!=6 
4!=24 
51=120 
61=720 
71=5,040 
81=40,320 
91=362,880 
101=3,628,800 
111=39,916,800 
121=479,001,600 
131=6,227,020,800 
141=87,178,291,200 
151=1,307,674,368,000 
161=20,922,789,888,000 
171=355,687,428,096,000 
181=6,402,373,705,728,000 
191=121,645,100,408,832,000 
201=2,432,902,008,176,640,000 
211=51,090,942,171,709,440,000 
221=1,124,000,727,777,607,680,000 
231=25,852,016,738,884,976,640,000 
241=620,448,401,733,239,439,360,000 
251=15,511,210,043,330,985,984,000,000 
261=403,291,461,126,605,635,584,000,000 
271=10,888,869,450,418,352,160,768,000,000 
281=304,888,344,611,713,860,501,504,000,000 
291=8,841,761,993,739,701,954,543,616,000,000 
301=265,252,859,812,191,058,636,308,480,000,000 
311=8,222,838,654,177,922,817,725,562,880,000,000 
321=263,130,836,933,693,530,167,218,012,160,000,000 
331=8,683,317,618,811,886,495,518,194,401,280,000,000 
341=295,232,799,039,604,140,847,618,609,643,520,000,000 
351=10,333,147,966,386,144,929,666,651,337,523,200,000,000 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 27 


SNOBOL4 STATISTICS SUMMARY- 

1381 MS. COMPILATION TIME 
1747 MS. EXECUTION TIME 

1880 STATEMENTS EXECUTED, 274 FAILED 

1686 ARITHMETIC OPERATIONS PERFORMED 
0 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
0 READS PERFORMED 
37 WRITES PERFORMED 

0.93 MS. AVERAGE PER STATEMENT EXECUTED 
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Sample Program 3 


SN0B0L4 (VERSION 3.0, DEC. 8, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 


* 

* THIS PROGRAM USES PROGRAMMER-DEFINED DATA TYPES TO 

« REPRESENT AN ARBITRARILY LONG INTEGER AS A LINKED LIST 

« CALLED ALI. OPSYN IS USED TO DEFINE A BINARY OPERATOR 

* AND TWO UNARY OPERATORS FOR MANIPULATING ALIS. 

* 

* * APPENDS A NODE TO THE HEAD OF A LIST. # AND / RETURN 

* THE VALUE OF THE HEAD OF THE LIST, AND THE LIST LINKED 

* FROM THE HEAD, RESPECTIVELY. 

« 

* THE OPERATORS + AND ♦ ARE GENERALIZED TO RETURN INTEGERS 

* IF THE OPERANDS ARE INTEGERS AND THE RESULT LESS THAN 

* MAX (10000). IF THE RESULT IS GREATER THAN MAX, AN ALI 

« IS GENERATED WITH THE VALUE OF THE HEAD EQUAL TO THE LOW 

* ORDER DIGITS, AND THE LINK POINTING TO AN ALI WITH THE 

* HIGHER DIGITS. IF EITHER OPERAND IS AN ALI, THE RESULT 

« IS AN ALI. 

* 

♦ THE USE OF ALIS IS ILLUSTRATED BY COMPUTING THE FIRST K 


« POWERS OF AN INTEGER N. 

* 

* 

fiANCHOR = 1 1 

OPSYN('SUM','+',2) 2 

OPSYN('PROD','♦',2) 3 

DATA('ALI(V,L)') 4 

DEFINE('OUT(OUT)') 5 

DEFINE('APPEND(V,L)') 6 

DEFINE('ADD(II,12)C') 7 

DEFINE('MULdI ,I2)C') 8 

DEFINE('VAL(VAL)') 9 

DEFINE('LINK(I)') 10 

OPSYN('+','ADD',2) 11 

OPSYN('♦','MUL',2) 12 

OPSYN('*','APPEND',2) 13 

OPSYN('/','LINK',1) 14 

OPSYN('#','VAL',1) 15 

MAX = 10000 16 

ADDFIX = RTAB(SIZE(MAX) - 1) . C REM . ADD 17 

MULFIX = RTAB(SIZE(MAX) - 1) . C REM . MUL 18 




* 


FUNCTION DEFINITIONS 







:(FEND) 

19 

APPEND 

* 

APPEND 

= 

ALI(V,L) 

:(RETURN) 

20 

* 

ADD 

ADD 


IDENT(I2) 11 

:S(RETURN) 

21 


ADD 

= 

IDENT(II) 12 

:S(RETURN) 

22 


ADD 

= 

SUM(#I1,#I2) 


23 


LT(ADD 

,MAX) 

INTEGER(II) INTEGER(12) :S(RETURN) 

24 


ADD 

= 

LT(ADD,MAX) ADD 

% (/II + /I2) :S(RETURN) 

25 


ADD 

ADDFIX 


26 

* 

ADD 

= 

ADD % (C + (/II + 

/I2)) :(RETURN) 

27 

♦ 

LINK 

A 

LINK 

= 

-.INTEGER(I) L(I) 

:(RETURN) 

28 

VAL 

VAL 


-,INTEGER(VAL) V(VAL) : (RETURN) 

29 


* 

* 


* 

OUT 

OUT 

_ 

IDENT(/OUT) #OUT 

:S(RETURN) 

30 


OUT 

= 

OUT(/OUT) DUPL('O',SIZE(MAX) - SIZE(#OUT) - 1) 

31 

+ 

« 



#OUT 

:(RETURN) 

31 

* 

MUL 

MUL 

_ 

DIFFER(«I1) DIFFER(#I2) 

PRODdll ,«I2) :F(RETURN) 

32 


LT(MUL, 

MAX) 

INTEGER(II) INTEGER(12) 

:S(RETURN) 

33 


MUL 

= 

LT(MUL,MAX) MUL X ( 11 

♦ /I2 + 12 ♦ /II) 

34 

+ 




:S(RETURN) 

34 


MUL 

MULFIX 


35 


MUL 

= 

MUL X (C + 11 ♦ /I2 + 

12 ♦ /II) 

36 

+ 




:(RETURN) 

36 

FEND 





37 


N 

= 

256 


38 


K 

= 

25 


39 


P 

= 

1 


40 


OUTPUT 

= 

'POWERS OF ’ N 


41 


OUTPUT 

= 



42 

L 

I 

= 

LT(I,K) I + 1 

:F(END) 

43 


P 

= 

P ♦ N 


44 


OUTPUT 

= 

I ' : ’ OUT(P) 

: (L) 

45 

END 





46 


NO ERRORS DETECTED IN SOURCE PROGRAM 
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POWERS OF 256 

1: 256 

2: 65536 

3: 16777216 

4: 4294967296 

5; 1099511627776 

6; 281474976710656 

7; 72057594037927936 

8: 18446744073709551616 

9: 4722366482869645213696 

10: 1208925819614629174706176 

11: 309485009821345068724781056 

12: 79228162514264337593543950336 

13: 20282409603651670423947251286016 

14: 5192296858534827628530496329220096 

15: 1329227995784915872903807060280344576 

16: 340282366920938463463374607431768211456 

17: 87112285931760246646623899502532662132736 

18: 22300745198530623141535718272648361505980416 

19: 5708990770823839524233143877797980545530986496 

20: 1461501637330902918203684832716283019655932542976 

21: 374144419156711147060143317175368453031918731001856 

22: 95780971304118053647396689196894323976171195136475136 

23: 24519928653854221733733552434404946937899825954937634816 

24: 6277101735386680763835789423207666416102355444464034512896 

25: 1606938044258990275541962092341162602522202993782792835301376 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 43 


SNOBOL4 STATISTICS SUMMARY- 

1514 MS. COMPILATION TIME 
5159 MS. EXECUTION TIME 

6511 STATEMENTS EXECUTED, 2782 FAILED 
762 ARITHMETIC OPERATIONS PERFORMED 
229 PATTERN MATCHES PERFORMED 

0 REGENERATIONS OF DYNAMIC STORAGE 
0 READS PERFORMED 
27 WRITES PERFORMED 

0.79 MS. AVERAGE PER STATEMENT EXECUTED 
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Sample Program 4 


SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 


♦ TOPOLOGICAL SORT 

* 

♦ MAPS A PARTIAL ORDERING OF OBJECTS INTO A LINEAR ORDERING 

« 

♦ A(1), A(2), ..., A(N) 

♦ 

♦ SUCH THAT IF A(S) < A(T) IN THE PARTIAL ORDERING,THEN S < T. 

♦ (CF. D.E.KNUTH, THE ART OF COMPUTER PROGRAMMING,VOLUME 1, 

♦ ADDISON-WESLEY,MASS.,1968, P.262) 

♦ 


6DUMP = 1 1 

6TRIM =1 2 

OUTPUT('OUT',6,’(121A1)') 3 

PAIR = BREAK('<') . MU LEN(1) BREAK(',') . NU LEN(1) 4 

DATA('ITEM(COUNT,TOP)') 5 

DATA('NODE(SUC,NEXT)') 6 

DEFINE('DECR(X)') 7 

* 

♦ SET UP A TABLE THAT ASSOCIATES WITH EACH OBJECT AN ITEM. 

X = TABLE(50) 8 


* 

♦ EACH ITEM HAS TWO FIELDS, (COUNT,TOP), WHERE 

♦ COUNT = NO. OF ELEMENTS PRECEEDING THE OBJECT. 

♦ TOP = TOP OF A LIST OF OBJECTS SUCCEEDING IT. 

♦ 

* READ IN RELATIONS. 

* 

♦ INITIALIZE THE ITEMS TO (0,NULL). 

« 


* FOR EACH RELATION, MU < NU, INCREASE THE COUNT OF PREDECESSORS 

* OF NU AND ADD A NODE TO THE LIST OF SUCCESSORS OF MU. 

* 


T1A 

OUT 

= '0 THE RELATIONS ARE:' 


9 

T2A 

REL 

= INPUT ',' 

:P(T3A) 

10 


OUTPUT 

= REL 


1 1 

T2 

REL 

PAIR = 

:P(T2A) 

12 


X<NU> 

= IDENT(X<NU>) ITEM(O) 


13 


COUNT(X<NU>) = COUNT(X<NU>) + 1 


14 


X<MU> 

= IDENT(X<MU>) ITEM(O) 


15 


TOP(X<MU>) 

= NODE(NU,TOP(X<MU>)) 

; (T2) 

16 


* 


♦ A QUEUE IS MAINTAINED OF THOSE ITEMS WITH ZERO COUNT FIELD. 
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♦ THE LINKS FOR THE QUEUE, QLINK, ARE KEPT IN THE COUNT FIELD. 

♦ THE VARIABLES F,R POINT TO THE FRONT AND REAR OF THE QUEUE. 

♦ 

T3A OPSYN('QLINK’,'COUNT’) 17 

Y = CONVERT(X,'ARRAY') 18 

PROTOTYPE(Y) BREAK(',') . N 19 

* 

* INITIALIZE THE QUEUE FOR OUTPUT. 



K 

= 0 


20 

T4 

K 

?Y<K + 1,1> K + 1 

:F(T8) 

21 


F = 

EQ(COUNT(Y<K,2>),0) Y<K 

,1> :F(T4) 

22 

T4A 

R = 

Y<K, 2> 


23 

T4B 

K 

?Y<K + 1,1> K + 1 

:F(T5) 

24 


QLINK(R) 

= EQ(C0UNT(Y<K,2>),0) 

Y<K,1> :F(T4B)S(T4A) 

25 

* 

OUTPUT THE FRONT OF THE QUEUE. 



T5 

OUT 

= '0 THE LINEAR ORDERING IS:' 

26 

T5A 

OUTPUT 

= DIFFER(X<F>) F 

:F(T8) 

27 


N 

= N - 1 


28 


P 

= TOP(X<F>) 


29 

* 

A 

ERASE RELATIONS. 



T6 

IDENT(P) 


:S(T7) 

30 


DECR(.COUNT ( X<SUC ( P )>)) 

:S(T6A) 

31 

* 

IF COUNT IS 

ZERO ADD ITEM TO QUEUE. 




QLINK(R) 

= sue( P ) 


32 


R 

= X<SUC(P)> 


33 

T6A 

3|{ 

P 

= NEXT(P) 

; (T6) 

34 

♦ 

A 

REMOVE FROM 

QUEUE. 



T7 

F 

= QLINK(X<F>) 

: (T5A) 

35 

* 

FUNCTION DEFINITIONS. 



DECR 

$x 

= GT($X,1 ) $X - 1 

:S(RETURN) 

36 


$x 

= 0 

: (FRETURN) 

37 

T8 

OUTPUT 

= NE(N,0) 'THE ORDERING 

CONTAINS A LOOP.' 

38 

END 




39 


NO ERRORS DETECTED IN SOURCE PROGRAM 
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THE RELATIONS ARE: 

LETTERS<ALPHANUM,NUMBERS<ALPHANUM, 

BLANKS<OPTBLANKS, 

NUMBERS<REAL, 

NUMBERS<INTEGER, 

LETTERS<VARIABLE,ALPHANUM<VARIABLE, 

BINARY<BINARYOP,BLANKS<BINARYOP, 

UNQALPHABET<DLITERAL, 

UNQALPHABET<SLITERAL, 

SLITERAL<LITERAL,DLITERAL<LITERAL,INTEGER<LITERAL,REAL<LITERAL, 

THE LINEAR ORDERING IS: 

LETTERS 

NUMBERS 

BLANKS 

BINARY 

UNQALPHABET 

INTEGER 

REAL 

ALPHANUM 

OPTBLANKS 

BINARYOP 

SLITERAL 

DLITERAL 

VARIABLE 

LITERAL 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 38 


DUMP OF VARIABLES AT TERMINATION 

NATURAL VARIABLES 

ABORT = PATTERN 
ARB = PATTERN 
BAL = PATTERN 
F = 0 

FAIL = PATTERN 
FENCE = PATTERN 

INPUT = 'SLITERAL<LITERAL,DLITERAL<LITERAL,INTEGER<LITERAL,REAL<LITERAL' 
K = 14 
MU = 'REAL' 

N = 0 

NU = 'LITERAL' 

OUT = '0 THE LINEAR ORDERING:' 

OUTPUT = 'LITERAL' 

PAIR = PATTERN 
R = ITEM 
REM = PATTERN 
SUCCEED = PATTERN 
X = TABLE(50,10) 

Y = ARRAY('14,2') 


UNPROTECTED KEYWORDS 

SABEND = 0 
gANCHOR = 0 
gCODE = 0 
gDUMP = 1 
gERRLIMIT = 0 
SFTRACE = 0 
gFULLSCAN = 0 
gINPUT = 1 
gMAXLNGTH = 5000 
gOUTPUT = 1 
gSTLIMIT = 50000 
gTRACE = 0 
gTRIM = 1 
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SN0B0L4 STATISTICS SUMMARY- 

1847 MS. COMPILATION TIME 
349 MS. EXECUTION TIME 

310 STATEMENTS EXECUTED, 71 FAILED 

64 ARITHMETIC OPERATIONS PERFORMED 
25 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
10 READS PERFORMED 
25 WRITES PERFORMED 

1.13 MS. AVERAGE PER STATEMENT EXECUTED 
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Sample Program 5 


SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 


I I 

I This program is a syntactic recognizer for SNOBOL4 statements. | 

I The extended graphics for the IBM 360 are used in the program | 

I and in the data. | 

I_I 

1 

I ^ I 

I A series of patterns is built, culminating in a pattern that | 

I matches SNOBOL4 statements. | 

I_I 

2 


I- 

I Card images are then read in and processed. 

I_ 


1 

I 

J 


3 

DEFINE('opt(pattern)') 4 

OPSYN('t','opt’,1) 5 

OPSyN('X','SPAN',1) 6 

OPSyN( '#’,'ANY',1) 7 

leftbr -#'<[' 8 

rightbr *-#’>]' 9 

letters 'ABCDEFGHIJKLNMOPQRSTUVWXYZabcdefghijklmnopqrstuvw’ 10 

+ 'xyz' 10 

digits - '0123456789' 11 

alphanumerics letters digits 12 

blanks ■>- %' ' 13 

integer Xdigits 14 

real integer '.' tinteger 15 

identifier #letters tX(alphanumerics ' ) 16 

opsyms '-i?$ . ! X*/#+-a 16' 17 

unary #opsyms 18 

binarysyms #opsyms | 19 

unaryop Xopsyms 20 

binaryop <- blanks t (binarysyms blanks) 21 

unqalpha SALPHABET 22 

unqalpha • _ 23 

unqalpha ^ 24 

dliteral - "" tX(unqalpha 25 

sliteral - "'" tX (unqalpha 26 

literal sliteral | dliteral | integer | real 27 

element tunaryop (identifier | literal | ♦function_call | 28 

+ '(' ’•‘expression ')' | ’•‘array_ref) 28 
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operation *element binaryop (■•‘element | ■•‘expression) 29 

expression <- tblanks (■•‘element | ■•‘operation | null) tblanks 30 

arglist *- ■•‘expression t(',' ■•‘arglist) 31 

function_call identifier ' ( ' ■•‘arglist ' ) ' 32 

array_ref - identifier leftbr ■•‘arglist rightbr 33 

label - #alpha]|iumerics (BREAK(' ;') | REM) 34 

label_field - tlabel 35 

goto *- ' ( ' expression ' )' | leftbr expression rightbr 36 

goto_field - t(blanks FENCE tblanks (goto | 'S' goto 37 

+ I 'P' goto I 'S' goto tblanks 'F' goto | 'F' goto 37 

-I- tblanks 'S' goto) tblanks) 37 

eql 38 

rule - t(blanks element (blanks eql t(blanks expression) | 39 

■f t (blanks expression t (blanks eql t (blanks expression)) 39 

+ ))) 39 

eos - RPOS(O) I ' j ' 40 

statement label_field rule goto_field eos 41 

42 


I I 
I The recognizing program follows. Comment and control cards | 
I are not processed. If an erroneous statement is encountered | 
I in a string of statements separated by semicolons, subsequent | 

I statements in that string are not processed. | 
I_I 






43 


INPUT(. 

input,5,72) 


44 


SANCHOR 

- 1 


45 


STRIM 

- 1 


46 


6PULLSCAN - 1 


47 


eof - 



48 


space 

- DUPL(' ',5) 


49 


line 



50 


comment 

- r'-’ 


51 


continue - #'+.' . cc 


52 

readi 

image 

- input 

: F(END) 

53 


OUTPUT 

«- space line 


54 


image 

comment 

:F(readc)S(readi) 

55 

nextst 

IDENT(eof) 

:F(END) 

56 


OUTPUT 

<- space line 


57 


image 

line 


58 

readc 

line 

- input 

:F(endgam) 

59 


line 

comment 

:S(print) 

60 


line 

continue 

:F(anlyz) 

61 


OUTPUT 

- space cc line 


62 


image 

- image line 

:(readc) 

63 

anlyz 

image 

statement ♦- 

;P(error) 

64 


DIFFER(image) 

:S(anlyz) 

65 

+ 



F(nextst) 

65 



228 Sample Programs 


error 

OUTPUT '«< syntactic error »>' 

:(nextst) 

66 

print 

OUTPUT - space line 

:(readc) 

67 

endgam 

eof 1 

(anlyz) 

68 

cpt 

opt null 1 pattern 

:(RETURN) 

69 

END 



70 


NO ERRORS DETECTED IN SOURCE PROGRAM 


X - Y+Z 

«< syntactic error »> 

ELEMENT<I,J>- ELEMENT<I,-J> + ELEMENT<-I,J> 

«< syntactic error »> 

A<X,Y,Z + 1> F(X, STRUCTURE. BUILD (TYPE, LENGTH + 1)) 

SETUP PAT1 - (BREAKC,:') $ FIRST | SPAN( ' .') $ SECOND 
-I- . VALUE ARBNO(BAL | LEN( 1 ) ) :($SWITCH) 

DEFINE('F(X,Y)) 

«< syntactic error »> 

L - LT(N,B<J> L + 1 

«< syntactic error »> 

NEWONE.TRIAL X - -iCOORD<1 ,K> X ♦ X 

TRIM(INPUT) PAT1 :S(OK) ;P(BAD) 

«< syntactic error »> 

X 3.01fY=2. jZ = X*-Y 

RANDOM (RAN.VAR ♦ N) / 10000 

LOOP OUTPUT = LT(N,10) :S(LOP 

<« 'syntactic error »> 

LOOP OUTPUT = LT(N,10) :S(LOOP) 

OUTPUT = X ;(PAST) 

?M<X> :S(A1)P(A2) 

OUTPUT 
F( 

F( 

«< syntactic error »> 


II I 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 56 


SNOBOL4 STATISTICS SUMMARY- 

1930 MS. COMPILATION TIME 
3594 MS. EXECUTION TIME 
232 STATEMENTS EXECUTED, 56 FAILED 

0 ARITHMETIC OPERATIONS PERFORMED 
63 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
24 READS PERFORMED 
30 WRITES PERFORMED 

15.49 MS. AVERAGE PER STATEMENT EXECUTED 




230 Sample Programs 


Sample Program 6 

SN0B0L4 (VERSION 3.0, DEC. 17, 1969) 

BELL TELEPHONE LABORATORIES, INCORPORATED 

♦ WHEN THE OUTPUT ASSOCIATION FOR "SING" IS CHANGED TO 

♦ A DIGITAL-TO-ANALOG CONVERTER WITH THE PROPER MELODY 

♦ SYNTHESIZER, THIS PROGRAM SINGS THAT OLD CHRISTMASTIME 

♦ FAVORITE, "A PARTRIDGE IN A PEAR TREE." 

« 

♦ M. D. SHAPIRO 

ACAPPELLA.CHOIR = 6 OR MORE PEOPLE SINGING IN TUNE 1 

DAYS = 'FIRST,SECOND,THIRD,FOURTH,FIFTH,SIXTH,' 2 

+ 'SEVENTH,EIGHTH,NINTH,TENTH,ELEVENTH,TWELFTH,' 2 

NEXT = BREAK(',') . WHICH LEN(1) 3 

♦ 

TRACE('SING','VALUE',,'SONG') 4 

6TRACE = 1000 5 

DEFINE('SONG()') :(NEXT.DAY) 6 

SONG PAUSE IDENT(SING) OUTPUT('SING', ACAPPELLA.CHOIR, 7 

+ H pause "',100A1)") = ' ' ; (RETURN) 7 

* 

NEXT.DAY DAYS NEXT = ;F(CODA) 8 

SING = (TAKE A BREATH) 9 

SING = 'ON THE ' WHICH ' DAY OF CHRISTMAS,' 10 

SING = 'MY TRUE LOVE GAVE TO ME,' :($WHICH) 11 

TWELFTH SING = 'TWELVE LORDS A-LEAPING,' 12 

ELEVENTH SING = 'ELEVEN LADIES DANCING,' 13 

TENTH SING = 'TEN PIPERS PIPING,' 14 

NINTH SING = 'NINE DRUMMERS DRUMMING,' 15 

EIGHTH SING = 'EIGHT MAIDS A-MILKING,' 16 

SEVENTH SING = 'SEVEN SWANS A-SWIMMING,' 17 

SIXTH SING = 'SIX GEESE A-LAYING,' 18 

FIFTH SING = 'FIVE GOLD RINGS,' 19 

FOURTH SING = 'FOUR COLLY BIRDS,' 20 

THIRD SING = 'THREE FRENCH HENS,' 21 

SECOND SING = 'TWO TURTLEDOVES,' 22 

FIRST SING = AND 'A PARTRIDGE IN A PEAR TREE.' 23 

AND = IDENT(AND) 'AND ' ;(NEXT.DAY) 24 

CODA SING = INPUT ;S(CODA) 25 

* 

END 26 


NO ERRORS DETECTED IN SOURCE PROGRAM 
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ON THE FIRST DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

A PARTRIDGE IN A PEAR TREE. 

ON THE SECOND DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE THIRD DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE FOURTH DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE FIFTH DAY OF CHRISTMAS, 
MY TRUE LOVE GAVE TO ME, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 


ON THE SIXTH DAY OF CHRISTMAS, 
MY TRUE LOVE GAVE TO ME, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 


ON THE SEVENTH DAY OF CHRISTMAS, 
MY TRUE LOVE GAVE TO ME, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 


231 
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ON THE EIGHTH DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

EIGHT MAIDS A-MILKING, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE N^INTH DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

NINE DRUMMERS DRUMMING, 

EIGHT MAIDS A-MILKING, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE TENTH DAY OF CHRISTMAS, 

MY TRUE LOVE GAVE TO ME, 

TEN PIPERS PIPING, 

NINE DRUMMERS DRUMMING, 

EIGHT MAIDS A-MILKING, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE ELEVENTH DAY OF CHRISTMAS, 
MY TRUE LOVE GAVE TO ME, 

ELEVEN LADIES DANCING, 

TEN PIPERS PIPING, 

NINE DRUMMERS DRUMMING, 

EIGHT MAIDS A-MILKING, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 
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FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

ON THE TWELFTH DAY OF CHRISTMAS, 
MY TRUE LOVE GAVE TO ME, 

TWELVE LORDS A-LEAPING, 

ELEVEN LADIES DANCING, 

TEN PIPERS PIPING, 

NINE DRUMMERS DRUMMING, 

EIGHT MAIDS A-MILKING, 

SEVEN SWANS A-SWIMMING, 

SIX GEESE A-LAYING, 

FIVE GOLD RINGS, 

FOUR COLLY BIRDS, 

THREE FRENCH HENS, 

TWO TURTLEDOVES, 

AND A PARTRIDGE IN A PEAR TREE. 

« 

***«« 

4c )|t 

)|c 4i 41 4i 4e 4t 4c 4c 4c 

4c4t4c4c4c4<4c4t4t4t4< 
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NORMAL TERMINATION AT LEVEL 0 
LAST STATEMENT EXECUTED WAS 25 


SNOBOL4 STATISTICS SUMMARY- 

1149 MS. COMPILATION TIME 
565 MS. EXECUTION TIME 

276 STATEMENTS EXECUTED, 123 FAILED 

0 ARITHMETIC OPERATIONS PERFORMED 
25 PATTERN MATCHES PERFORMED 
0 REGENERATIONS OF DYNAMIC STORAGE 
9 READS PERFORMED 
121 WRITES PERFORMED 
2.05 MS. AVERAGE PER STATEMENT EXECUTED 
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Solutions to Exercises 


Solution 1.1 

R OUTPUT 

OUTPUT 
PUNCH 

END 


= INPUT 
= OUTPUT 


F(END) 

S(R) 

(R) 


Solution 1.2 


FRONT = LEN(8) 

R X = INPUT 

X FRONT = 

OUTPUT = X 

END 


: F(END) 
: (R) 


Solution 1.3 



E 

POWER 


2.71828 

1 



OUTPUT 

= 

’POWERS 

OF E: 

LOOP 

POWER 

OUTPUT 


POWER ♦ 

POWER 

E 


SUM 

= 

SUM + 

POWER 


I 

= 

LT(I,9) 

I + 

END 

OUTPUT 


'SUM: ' 

SUM 


:S(LOOP) 


235 
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Solution 1.4 



6TRIM 

= 1 


IN 

'ABCDEFGHIJKLMNOPQRSTUVWXYZ, . ; ; 1 ?0 123456789+-*/aJt 


CODE 

= '0123456789+-*/aX ABCDEFGHIJKLMNOPQRSTUVWXYZ 

READ 

END 

OUTPUT 

= replace!INPUT,IN,CODE) ;S(READ) 


Solution 

13 

6TRIM 

A 

I 

= 

1 

ARRAY(INPUT) 

1 


READ 

A<I> 

= 

INPUT 

;F(PRINT) 


I 

= 

I + 1 

:(READ) 

PRINT 

I 

= 

I - 1 


END 

OUTPUT 


A<I> 

;S(PRINT) 


Solution 1.6 


CH = LEN(1) . X 

HEX = TABLE(16) 

HEX<'0’> = 0; HEX<'1'> = 1; 
HEX<'4'> = 4; HEX<'5'> = 5; 
HEX<'8'> = 8; HEX<'9'> = 9; 

HEX<'2'> = 2; HEX<'3’> = 3 
HEX<'6'> = 6; HEX<'7'> = 7 
HEX<'A'> = 10; HEX<'B'> = 


HEX<'C'> = 12; HEX<'D'> = 13; HEX<'E'> = 14; HEX<'P'> = 15 

* 

DEFINE('H(n)x') 


H N CH = :F(RETURN) 

H = DIFFER(HEX<X>) 16 ♦ H + HEX<X> 

+ :F(FRETURN)S(H) 

END 


Solution 1.7 

PARTA DATA('BNODE(VALUE,LSON,RSON)') 

PARTB N = BNODE('+',BNODE('X'),BNODE('Y')) 

PARTC DEPINE('LLIST(N)') 
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LLIST OUTPUT = DIPFER(N) VALUE(N) :F(RETURN) 

LLIST(LSON(N)) 

LLIST(RSON(N)) :(RETURN) 

END 


Solution 2.1 

6TRIM = 1 

COUNT = 0 

PAT = POS(O) LEN(*N) BREAK('.,:j?I()"' " ) LEN(1) SN 


READ 

TEXT 

= INPUT 

:F(PRINT) 


OUTPUT 

= TEXT 



N 

= 0 


C 

TEXT 

PAT 

:F(READ) 


COUNT 

= COUNT + 1 

: (C) 

PRINT 

OUTPUT 

= 



OUTPUT 

= 'THERE ARE 

' COUNT ' PUNCTUATION MARKS 


END 


Solution 22 


( 'C' I 'CR') $ OUTPUT ('O' I '00' I 
('P' I 'PS') $ OUTPUT RPOS(0) 


' EE') $ OUTPUT 


Solution 23 



£TRIM 

= 

1 


WORD 

= 

BREAK(' ,.:;?') . W LEN(1) 


N 

= 

INPUT 


OUTPUT 

= 

' WORDS OF LENGTH ' N ' ARE:' 

READ 

TEXT 

= 

INPUT ' ' :F(END) 

NEXTW 

TEXT 

WORD = :F(READ) 

END 

OUTPUT 


EQ(SIZE(W),N) W :(NEXTW) 


This program assumes the word length is given left justified on the first input card. Text 
on subsequent cards is assumed to be broken between words. 
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Solution 2.4 


LETTER = 'ABCDEPGHIJKLMNOPQRSTUVWXYZ' 

DIGIT = '0123456789' 

LABEL = ' ' I ANY(LETTER) (SPAN(LETTER DIGIT) 9N ♦LE(N,8) 


+ 

eANCHOR 

1 NULL) ' ' 

= 1 


R 

CARD 

= INPUT 

:F(END) 


CARD 

LABEL 

:S(R) 

END 

OUTPUT 

= CARD 

: (R) 


Solution 2.5 





QUOTE 

= 

anyC" •"") 



LITERAL 


(QUOTE $ Q BREAK(*Q) 

LEN(1) ) 

Solution 2.6 





STRIM 

= 

1 ; 6FULLSCAN 

= 1 


ABC 

POS(O) (SPAN('A') 3N SPAN('B') 

+ 


POS(*(2 ♦ N)) RPOS(*N) 

SPAN( 'C ) RPOS(O) ) 

READ 

CARD 

= 

INPUT :F(END) 



CARD 

ABC 

;P(NO) 



OUTPUT 

= 

"" CARD '" MATCHES. 

' ;(READ) 

NO 

OUTPUT 

= 

"" CARD '" DOES NOT 

MATCH.' :(READ) 

END 






Solution 2.7 


+ 


REP 


POS(O) (LEN(1) ARB) $ X (RPOS(O) ABORT 
I ARBNO(*X) RPOS(O)) 


Solution 2.8 


HOL 


POS(O) BREAK('H') $ N LEN(*(N +1)) RPOS(0) 
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Solution 2.9 



&FULLSCAN 

= 

= 1 ; £TRIM 

= 1 



PALIN 

= 

LEN(1) $ 

H i 

ax («GT(2 * X,SIZE(S)) 

I 

+ 



*RTAB(X) 

ay 

az (*H az I «EQ(Y,Z) 

ABORT) 

READ 

S 

= 

INPUT 


:F(END) 



S PALIN 



:F(NO) 



OUTPUT 

= 

1 n 1 g 1 II 

IS 

A PALINDROME.' :(READ) 

NO 

OUTPUT 

= 

1 II 1 g 1 II 

IS 

NOT A PALINDROME.' : 

(READ) 

END 







Solution 3.1 







B2 

= 

1 1 





FIELDS 

= 

LEN(7) . : 

LAB 

LEN(8) . OP LEN(20) 

. OPER 

+ 



LEN(41) 

. COM 


R 

INPUT 

FIELDS 


:F(END) 



OUTPUT 

= 

LAB B2 OP 

B2 

OPER COM :(R) 



END 


Solution 3.2 



£TRIM 

= 1j £ANCHOR 

= 1 


BB 

= BREAK(' ' 

); SB = SPAN( ' ') 


FIELDS 

= BB . FI 

SB BB . F2 SB (BB . F3 SB REM . 

+ 


1 REM . 

F3 NULL . P4) 

READ 

CARD 

= INPUT 

:F(END) 


CARD 

FIELDS 

:F(ERROR) 


IMAGE 

= FI DUPLC 

',9 - SIZE(PI)) F2 DUPLC ', 

+ 


10 - SIZE(F2)) F3 DUPLC ',20 - SIZE(F3) 

+ 



:F(ERROR) 


OUTPUT 

= LE(SIZE(IMAGE),80) IMAGE :F(ERR0R) 


PUNCH 

= OUTPUT 

:(READ) 

ERROR 

OUTPUT 

= FORMAT ERROR IN; ' CARD ;(READ) 


P4 


F4 


END 


Solution 33 


STRIM 

BB 

QL 

FIELDS 


SANCHOR = 1 

BREAK(' ')? SB = SPAN(' ') 

ANYC"") BREAKC"") LEN(1) 

BB . F1 SB BB . F2 SB ((QL | BB) . F3 SB REM 
I REM . F3 NULL . F4) 
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READ 


+ 

+ 


ERROR 

END 


COMMENT 

CARD 

CARD 

CARD 

IMAGE 


OUTPUT 

PUNCH 

OUTPUT 


= ( 

= INPUT 
COMMENT 
FIELDS 
F1 DUPL( 


REM) . OUTPUT 
F(END) 

S(READ) 

F(ERROR) 

,9 - SIZE(F1)) 

10 - SIZE(F2)) F3 DUPL( 

:F(ERROR) 

LE(SIZE(IMAGE),80) IMAGE 
OUTPUT ;(READ) 

’♦♦♦ FORMAT ERROR INJ ' CARD 


F2 DUPL(' ’, 

',20 - SIZE(F3)) F4 


:F(ERROR) 

: (READ) 


Solution 3.4 

STRIM 

READ X 

MAX 

PRINT OUTPUT 

END 


= 1 
INPUT 

= GT(X,MAX) 
= 'MAXIMUM; 


;P(PRINT) 
X :(READ) 

' MAX 


Solution 3.5 



STRIM 

= 

1 


READ 

N 

= 

INPUT :F(END) 


FACT 

= 

INTEGER(N) GT(N) 

1 :F(ERROR) 


I 

= 

1 


L 

I 

= 

LT(I,N) I + 1 

;F(PRINT) 


FACT. 

= 

FACT ♦ I :(L) 


PRINT 

OUTPUT 

= 

N '! = ' FACT 

:(READ) 

ERROR 

OUTPUT 

= 

’♦♦♦♦ ERROR ; ’ 

N ' NOT A POSITIVE 


+ 

END 


:(READ) 


Solution 3.6 



STRIM 

= 

1 


READ 

X 

= 

INPUT ;F(END) 



COUNT 

= 

1 



Y 

= 

INPUT ;F(END) 



Z 

= 

REPLACE(X,X,Y) 


C 

COUNT 

= 

DIFFER(Z,X) COUNT + 1 

;P(PRINT) 


Z 

= 

REPLACE(Z,X,Y) :(C) 


PRINT 

OUTPUT 

= 

’THE PERMUTATION (' X ', 

’ Y ') HAS 


INDEX 
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This program assumes the members of the permutation are given left justified on 
successive cards. 


Solution 3.7 

READ INPUT LEN(7) . OUTPUT :P(END) 

OUTPUT = REPLACE('ABCDEPG','GPEDCBA',OUTPUT) 
OUTPUT = ;(READ) 

END 


Solution 3.8 

OPSYNl'#','SIZE',1) 

Solution 3.9 

OPSYNi,2) 

Solution 4.1 

DEPINE('SUBSTR(S,N) ' ) 

SUBSTR S LEN(N) $ OUTPUT RPOS(O) :(RETURN) 

This function requires the unanchored mode. 

Solution 42 

DEPINE('P(N)') 

EQ(N) 0 :S(RETURN) 

EQ(N,1) 1 :S(RETURN) 

P(N - 1) + P(N - 2) :(RETURN) 
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Solution 43 


+ 

+ 


6ANCH0R = 1 

EXPAT = BAL . X ' ! ' .OP REM . Y 
PMPAT = (ARBNO(BAL ANY('+-')) $ X FAIL | 
TAB(*(SIZE(X) - 1))) . X LEN(1) 
MDPAT = (ARBNO(BAL ANY('♦/')) $ X PAIL | 
TAB(*(SIZE(X) - 1))) . X LEN(1) 
STRIP = '(' BAL . POL ')' RPOS(O) 

DEFINE('POL(POL)X,Y,OP') 


♦DIFFER(X) 

. OP REM . 
♦DIFPER(X) 

. OP REM . 


Y 

Y 


POL 

POL 

STRIP 



:S(POL) 

+ 

POL 

PMPAT 

= OP 

’ ( ’ 

POL(X) POL(Y) ')' 

:S(RETURN) 

+ 

POL 

MDPAT 

= OP 

' ( ' 

POL(X) POL(Y) ')' 

:S(RETURN) 

+ 

POL 

EXPAT 

= OP 

' ( ’ 

POL(X) POL(Y) ')' 

:(RETURN) 


Solution 4.4 

DEFINE('NOT(P)') 
SPULLSCAN = 1 

DEFINE('PLAG()') 


FLAG FLAG. = 1 ;(RETURN) 

NOT NOT = NULL $ FLAG. P ♦PLAG() PAIL 

+ I ♦IDENTCPLAG.) :(RETURN) 


Solution 5.1 


£PULLSCAN = 

1 


S/^CHOR = 

1 


£TRIM = 1 



DEFINE('INCR()') 


G = 

ARRAY(INPUT ',2') 


NUMBER 

BREAK(' ') . N SPAN(' 

' ) 

NODE 

BREAK(' ') .X SPANC 

’ ) 

COUNT = 

SCT BK SUCCEED TAB(*K) 

BREAK(' ') *INCR() 


SPANC ') ax RPOS(O) 
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>K 


R 

LINE 

= 

INPUT ' ' 

:F{GO) 


OUTPUT 

= 

LINE 



LINE 

NUMBER = 



LINE 

COUNT 



G<N,1> 

= 

ARRAY(CT) 



I 

= 

0 


S 

I = I 

+ 

1 



LINE 

NODE = 

:F(R) 


ITEM(G<N, 

1>, 

I) = X 

:S(S)F(ERR) 

GO 

N 

= 

0 



BB 

= 

DUPL(' ',5) 


G01 

N 

= 

N + 1 



X 

= 

G<N,1> 

:F(END) 


OUTPUT 

= 

N 



I 

= 

DIFFER(G<N, 

1>) 0 :F(G01) 

G02 

I 

= 

I + 1 



OUTPUT 

= 

BB X<I> 

:S(G02)F(G01) 

INCR 

END 

CT 


CT + 1 

:(RETURN) 


The pattern COUNT counts the number of fields separated by blanks. Transfer to the 
undefined label ERR causes program termination with an error message. 


Solution 5^ 

P define!'PATH(I,J)K') DEFINE('PATHIU,J)S,L,K') 


PATH 

K 

G<K,2> 

— 

K + 1 

:S(PATH) 


PATH 

= 

PATH1(I,J) 

:S(RETURN)F(FRETURN) 

PATH1 

S 

= 

DIFFER(G<I 

,1>) G<I,1> ;F(FRETURN) 


G<I,2> 

= 

IDENT(G<I, 

2>) 1 :F(FRETURN) 

PATH_T 

L 

= 

L + 1 



K 

= 

S<L> 

:F(PATH_D) 


PATH1 

= 

EQ(K,J) I 

J ;F(PATH_T)S(RETURN) 

PATH_D 

L 

= 

0 


PATH_A 

L 

= 

L + 1 



K 

= 

S<L> 

:F(FRETURN) 


PATH1 = I PATHKK.J) :S (return )F(PATH_A) 

These functions supplement Solution 5.1. 
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Solution 



SDUMP 

= 

1 



&ANCHOR 

= 

1 



ALPHA 

= 

any!'abcdefghijklmnopqrstuvwxyz') 



ALPHANUM 

= 

any!'ABCDEFGHIJKLMNOPQRSTUVWXYZO123456789') 


LABEL 

= 

(ALPHA ARBNO!ALPHANUM)) . L SPAN!' ') 



output!'eject',,'!1H1)') 



DEFINED 

= 

BREAK!':') 



FLOW 

= 

table!10,10) 



BRANCH 

= 

(span!' ') 1 NULL) 'BRANCH' SPAN!' ') 


+ 



break!' ') . L 


R 

LINE 

= 

INPUT :F!LIST) 



OUTPUT 

= 

LINE 



STNO 

= 

STNO + 1 



LINE 

LABEL = ;F!BRANCH) 



PLOW<L> 

DEFINED :S(MULTI) 



FLOW<L> 

= 

STNO ':' FLOW<L> 


BRANCH 

LINE 

BRANCH :F!R) 



FLOW<L> 

= 

FLOW<L> STNO ',' :!R) 


MULTI 

OUTPUT 

= 

•****« error-multidefined label ♦♦♦♦♦' 

: (R) 

LIST 

LIST 

= 

CONVERT(FLOW,'ARRAY') 



EJECT 

= 




OUTPUT = 

DUPLC ',20) 'FLOW TABLE' 



OUTPUT 

= 




I 

= 

0 


LOOP 

I 

= 

I + 1 



X 

= 

LIST<I,1> :F!END) 



LIST<I,2> 

DEFINED ;S!LIST1) 


OUTPUT 

= '♦♦♦ ' 

LIST<I,1> ' IS UNDEFINED LABEL:' 


+ 



LIST<I,2> :!L00P) 


LIST1 

OUTPUT 

= 

X dupl!' ',15 - size!x)) LIST<I,2> 

:S!LOOP) 


OUTPUT 

= 

X dupl!' ',5) LIST<I,2> :!L00P) 


END 






In the flow table, each label is given with the defining statement number, followed by a 
list of statement numbers containing branches to the label. 


Solution 5.4 


DATA('C0MPLEX(R,I)') 

define!'MUL(X,Y)') 

OPSYN('X',,2) 

OPSYN(,'MUL' ,2) 

:(END) 

MUL MUL = DIFFER(DATATYPE(X),'COMPLEX') 

+ DIFFER(DATATYPE(Y),'COMPLEX') 
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+ 


MUL 


+ 


MUL 


+ 


MUL 


+ 

OUT MUL 


X % Y :S(RETURN) 

= DIFFER(DATATYPE(X),’COMPLEX') 

COMPLEX(X % R(Y),X * l(Y)) :S(OUT) 
= DIFFER(DATATYPE(Y),'COMPLEX') 

COMPLEX(Y ♦ R(X),Y ♦ l(X)) :S(OUT) 

= COMPLEX(R(X) X R(Y) - l(X) % l(Y), 

R(X) % I{Y) + R(Y) % I(X)) 

= EQ(I(MUL)) R(MUL) :(RETURN) 


An example of the use of this function would be the execution of 


the following statements: 


A = complex!5,3) 
B = complex!6,2) 
PR = A ♦ B 

PR = 4 ♦ A 

PR = B ♦ 5 


Solution 6.1 

define!'push!a)') 
define!'pop(a)') 
define!'top(a)') 

OPSYN! , 'PUSH',1) 
OPSYN!'#','POP',1) 
OPSYN!'!','TOP',1) 
6TRIM = 1 


PUSH A<0> = A<0> + 1 
PUSH = .A<A<0» 

* 

POP POP = A<A<0» 

A<0> = GT!A<0>) A<0> 

* 

TOP TOP = .A<A<0» 


: F!FRETURN)S!NRETURN) 


: S!RETURN)F!FRETURN) 
: (NRETURN) 


A typical list could be defined by the statement 
X = ARRAY!'0;10',0) 
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Solution 6^ 


SANCHOR = 1 
STRIM = 1 
PROG = 

CONT = ’ + ' I ' . ’ 

READ STRING = INPUT 
STRING ’EOF' 

STRING 

STRING CONT = 

PROG = PROG ';’ STRING 
CONTINUE PROG = PROG STRING 
EXEC PROG = CODE(PROG) 

END 


:F(ERR) 

:S(EXEC) 

:S(READ) 

:S(CONTINUE) 
:(READ) 

: (READ) 

; S<PROG>F(ERR) 


Solution 8.1 

DEFINE('TR(VAL,TAG)NO,T,N,Q') 
OPSYN('D’,'DATATYPE’) 

6TRACE = 10000 


TR 


TRA 

+ 

+ 


NO 

T 

N 

N 

Q 

OUTPUT 


= 6LASTNO 
= TIMEO 

= IDENT(D(VAL),'NAME’) TAG 
= IDENT(D(VAL),'STRING’) VAL 
= IDENT(D($VAL),'STRING') "'" 

= 'STATEMENT ' NO ';' N ' = ' 

Q CONVERT($VAL,'STRING') Q 
', TIME = ' T ;(RETURN) 


:S(TRA) 


Solution 8.2 

6TRIM = 1 

STRACE = 1000 

DEPINE('PUT(V)I' ) 

INIT N = LT(N,50) N + 1 :F(TEST) 

TRACE('COL' N,,,'PUT') :(INIT) 
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PUT 

TEST 


Solution 


OUTOPF 


V 'COL' REM . I :F(ERR) 

OUTPUT = DUPL('',!) $V :(RETURN) 


83 


6TRACE = 10000 

DEFINE('OUTOFP(X)') 

TRACE('OUTPUT',,,'OUTOPF') 


IDENT($X,'EOF') :F(RETURN) 

eOUTPUT = 0 STOPTR('OUTPUT') 


(RETURN) 
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Index 


SABEND 129 

ABORT 49 

SABORT 128 

addition ( +) 2, 89, 145 

SALPHABET 80, 128 

alternation ( | ) 8, 25, 89, 145 

6ANCHOR 29, 32, 128, 129 

anchored mode 31 

anchored pattern matching 31 

ANY 40 

APPLY 84 

ARB 56, 66 

SARB 128 

ARBNO 58, 67 

ARC 161 

arguments (see function call) 
arithmetic 2 
arithmetic expressions 2 
arithmetic operators 
addition ( +) 2, 89, 145 
division ( /) 2, 89, 145 
exponentiation ( *♦ or I ) 2, 89, 145 
multiplication ( ♦) 2, 89, 145 
subtraction ( -) 2, 89, 145 
unary minus ( -) 2, 89, 145 
unary plus ( +) 2, 89, 145 
ARRAY 113, 119, 120, 122 
arrays 

data type 18, 113, 119, 122, 139, 141 
dimensions 17, 113, 120 
elements 113 
indexing 120 
initial value 18, 113 
references 18, 115, 121, 193 
assignment statements 1, 146 
associative arrays 118 
associativity 89, 104, 174 


automatic conversion of data tyf)es 
EXPRESSION to PATTERN 148 
INTEGER to PATTERN 148 
INTEGER to REAL 148 
INTEGER to STRING 147 
REAL to INTEGER 148 
REAL to PATTERN 148 
REAL to STRING 147 
STRING to INTEGER 148 
STRING to PATTERN 148 
STRING to REAL 148 


BACKSPACE 170 
BAL 60 
SBAL 128 

balanced expressions 60 
BCD 81 

bead diagrams 26, 41, 44, 56 
binary operators 88, 89, 145, 174 
addition ( +) 2, 89, 145 
alternation ( | ) 8, 25, 89, 145 
concatenation (blank) 4, 8, 25, 89, 145 
conditional value 

assignment ( .) 9, 32, 89, 145, 177 
division ( /) 2, 89, 145 
exponentiation ( ♦♦ or ! ) 2, 89, 145 
immediate value 

assignment ( $) 10, 33, 89, 145 
multiplication ( ♦) 2, 89, 145 
subtraction ( -) 2, 89, 145 
unused ( -i? X # 3 £ ) 89 
BREAK 38 
break characters 38 
bubble sort 116 
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call tracing 153 
cancellation termination 190 
carriage control 167 
CLEAR 196 

clearing variable values 196 
CODE 

data type 139 
function 135, 172 
COLLECT 195 
comment line 20 
compilation 

source program 172, 176, 191 
during execution 135 
compilation errors 175 
concatenation (blank) 4, 8, 25 
concatenation with null string 78, 145, 147 
conditional goto 10 

conditional value assignment ( . ) 9, 32, 

177 

conditionally fatal errors 180 
continuation lines 20 
control lines 
EJECT 173 
LIST 173 
LEFT 173 
RIGHT 173 
UNLIST 173 
conversion of data types 
explicit 140 
implicit 147 
CONVERT 122, 140 
COPY 119 

created names 133, 139, 141 
created variables 131 
cursor 26, 30 

cursor position ( a) 36, 145 

DATA 19, 123, 139 
data set reference numbers 166 
data sets 166 
data type conversion 
explicit 140 
implicit 147 

data type representation 139 


data types 

array 18, 113, 119, 122, 139, 141 
created name 133, 139, 141 
external 85, 139, 141 
integer 2, 78, 139, 141, 147, 148 
object code 135, 139, 141 
pattern structure 24, 25, 26, 79, 139, 141, 
148, 193 

programmer-dehned 

data type 19, 123, 139, 141 
real number 3, 79, 139, 141, 147, 148 
string 3, 79, 139, 141, 147, 148, 192 
table 19, 118, 139, 141 
unevaluated expression 50, 62, 83, 139, 
141, 148 

data types of arguments 
binary operators 145 
primitive functions 143 
unary operators 144 
data types of operations 142 
binary 145 
unary 144 

data types of primitive functions 143 

DATATYPE 139 

DATE 83 

DD names 166 

DD cards 166 

DEFINE 14, 92, 162 

degenerate statements 177 

DETACH 170 

diagnostics 149, 163 

DIFFER 14, 79 

direct gotos 135 

discontinuation of tracing 159 

division ( /) 2, 89, 145 

DUMP 163 

6 DUMP 17, 130, 183 

DUPL 12, 42, 82 


EBCDIC 80, 81 
efficiency 193 
EJECT 173 
END 176 
end of 61e 172 
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end statement 1, 172, 176 
ENDFILE 170 
entry point 14, 92 
error conditions 

conditionally fatal 180 
unconditionally fatal 180 
error control 187 
6ERRLIMIT 130, 158, 180, 187 
6ERRTYPE 128, 180 
error messages 
compilation 175 
program 180 
error termination 187 
EQ 77 
equal sign 1 
EVAL 50, 83 

evaluation of expressions 177 
evaluation of statements 177 
execution of object code 135 
exponentiation ( ♦♦ or !) 2, 89, 145 
expressions 
arithmetic 2 
balanced 60 
evaluation of 177 
unevaluated 50, 62, 83 
extended syntax for IBM 360 202 
external functions 
calls 86 

data type 86, 139 
library 86 
loading 85 
prototype 86 
unloading 87 

FAIL 47 
&FAIL 129 
failure 

during input 10, 165 
in expression evaluation 178 
in pattern matching 10, 49, 179 
of array references 18, 115, 121, 177 
of functions 13, 77, 94, 179 
fatal errors 

conditional 180 


unconditional 180 
FENCE 49 
£FENCE 129 
FIELD 161 

held functions 19, 124, 126 
Held name 124 

Helds (see programmer-deHned 
data types) 

Hies (see I/O) 
flow of control 10, 134 
fiFNCLEVEL 129, 158 
forcing storage regeneration 195 
formal arguments 14, 92, 95 
formal identiflcation 139, 140 
formats 167 
FORTRAN IV 166 
FRETURN 15, 94 
fiFTRACE 130, 154, 160 
fiFULLSCAN 29, 62, 130 
fullscan mode 72 
function calls 

arguments of 12, 14, 76, 92, 95 
failure of 13, 77, 94, 179 
level of 153 

preservation of values 95, 99 
recursive 16, 98 
restoration of values 95 
value of 77, 93, 94 
function deflnitions 
DEFINE 14, 92, 162 
entry points 14, 92 
function names 14, 92, 93, 95 
formal arguments 14, 92, 95 
local variables 14, 92, 95 
prototype 14, 92 
function name as argument 101 
function procedures 14, 76, 92, 93 
primitive 76 

programmer-deflned 15, 93, 134 
external 85 
FRETURN 14, 94 
NRETURN 95, 134 
RETURN 15, 94, 153 
returning value 93 
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function synonyms 87 
function tracing 153 

GE 77 

generated variables 11, 131 
gotos 10, 134 
conditional 10 
failure 10 
success 10 
direct 135 
processing of 177 
unconditional 11 
GT 77 


hash addressing 192 
heuristics 62 


I DENT 14, 79 
identifiers 198 

immediate value assignment ( $) 10, 33 
implementation 191 
implicit alternatives 56 
implicit names 133 

indirect reference ( $) 11, 131, 192, 193 
in goto 11, 131 
infix notation 102, 104 
initial position of cursor 31 
initial value of 
array elements 18, 113 
variables 131 
INPUT 

function 169 

variable 5, 10, 17, 165, 178 
6 INPUT 130, 171, 195 
input association 169 
input data 5, 20, 164, 195 
INTEGER 78 
integer data type 2, 139 
integers 1, 2, 79 
interpreter 191 
interrogation (?) 84 
ITEM 121 


I/O 

functions 

BACKSPACE 170 
DETACH 170 
ENDFILE 170 
INPUT 169 
OUTPUT 167 
REWIND 170 
keywords 

6INPUT 130, 171, 195 
60UTPUT 130, 171, 195 
system 166 
FORTRAN IV 166 


job control language ( JCL) 166 


keywords 17, 128 
SABEND 129 
£AB0RT 128 
6ALPHABET 80, 128 
£ ANCHOR 29, 30, 31, 128, 129 
£ARB 128 
£BAL 128 
SCODE 129 
£DUMP 17, 130, 183 
fiERRLIMIT 130, 158, 180 
fiERRTYPE 128, 180 
£FAIL 129 
£FENCE 129 
£FNCLEVEL 129, 158 
£FTRACE 130, 154, 160 
£FULLSCAN 29, 62, 130 
£INPUT 130, 171, 195 
£LASTNO 129, 160 
£MAXLNGTH 130 
£OUTPUT 130, 171, 195 
£REM 129 

£RTNTYPE 129, 160 
£STCOUNT 129, 158 
£STFCOUNT 129, 158 
£STLIMIT 130 
£STN0 129, 160 
£SUCCEED 129 





Index 253 


6TRACE 130, 149, 159, 160 
STRIM 17, 82, 130, 195 
protected 128 
unprotected 129 
keyword operator ( £) 17, 128 
keyword tracing 158 

label 10, 134, 177, 198 
attribute 134 
tracing 157 
6LASTNO 129, 160 
LE 13, 77 
left recursion 67 
LEN 13, 37 
LGT 14, 80 
LIST 173 
listing control 173 
EJECT 173 
LIST 173 
LEFT 173 
RIGHT 173 
UNLIST 173 
literals 2, 6, 198 
LOAD 85 
LOCAL 161 

local variables 14, 92, 95 
loop control 13, 77 
loops in pattern matching 67 
LT 77 


£MAXLNGTH 130 
modes of scanning 29 
multiplication ( ♦) 2, 89, 145 

name operator ( . ) 89, 132, 145 
names 130 

natural variables 130, 151, 163, 192 

NE 77 

needle 26 

negation (-i) 84 

normal termination 183 

NOTANY 40 


NRETURN 95, 131, 134 
null string 4, 13, 35, 44, 77 
numeral strings 4 
numerical predicates 77 
EQ 77 
GE 77 
GT 77 

INTEGER 78 
LE 13, 77 
LT 77 
NE 77 


object 7, 24, 146, 177 
object code 135, 172, 191 
object comparison predicates 79 
DIPPER 14, 79 
IDENT 14, 79 
LGT 14, 80 
object evaluation 177 
omitted arguments 13, 17, 77, 96 
operator definition 87 
operator precedence 2, 174 
operator synonyms 87 
OPSYN 87, 162 
OS/360 166 
OUTPUT 

function 167 
variable 5, 34, 164 
gOUTPUT 130, 171, 195 
output association 167 

parentheses 3, 174 
passing names 133 
pattern evaluation 24, 177 
pattern matching 6, 24, 177, 193 
pattern matching statements 6, 24, 146 
pattern structures 24, 25, 26, 79, 193 
left recursion 67 
pattern-valued expressions 8 
pattern-valued functions 13, 37 
patterns 6, 7, 8, 24, 146 
Polish prefix notation 102, 104, 191 
POS 44 
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precedence 2, 89, 174 
predicates (see numerical and 
object comparison) 
primitive functions 12, 76, 143 
ANY 40 
APPLY 84 
ARBNO 58, 67 
ARG 161 

ARRAY 113, 119, 120 
BACKSPACE 170 
BREAK 38, 44 
CLEAR 196 
CODE 135, 172 
COLLECT 195 
CONVERT 120, 140 
COPY 119 
DATA 19, 123, 140 
DATATYPE 139 
DATE 83 

DEFINE 14, 92, 162 
DETACH 170 
DIFFER 14, 79 
DUMP 163 
DUPL 12, 42, 82 
ENDFILE 170 
EQ 77 

EVAL 50, 83 
FIELD 161 
GE 77 
GT 77 

IDENT 14, 79 
INPUT 169 
INTEGER 78 
ITEM 121 
LE 13, 77 
LEN 13, 37 
LGT 14, 80 
LOCAL 161 
LT 77 
NE 77 
NOTANY 40 
OPSYN 87, 162 
OUTPUT 167 
POS 44 

PROTOTYPE 120 


REMDR 82, 99 
REPLACE 12, 81 
REWIND 170 
RPOS 44 
RTAB 41 
SIZE 12, 81 
SPAN 38 
STOPTR 159 
TAB 41 

TABLE 19, 118, 122, 139 

TIME 83 

TRACE 149 

TRIM 82 

VALUE 126 

primitive pattern structures 
ABORT 49 
ARB 56, 67 
BAL 60 
PAIL 47 
FENCE 49 
REM 41 
SUCCEED 61 
printed output 164, 167 
program error messages 180 
program execution 176 
program format 20 
program statistics 183 
program termination 172, 183 
program mer-defined 
data types 19, 123 
DATA 19, 123, 139 
FIELD 161 
prototype 123 
functions 

DEFINE 14, 92, 162 
entry label 14, 92 
formal arguments 14, 92, 95 
FRETURN 15, 94 
local variables 14, 92, 95 
name 14, 92, 93, 95 
NRETURN 95, 131, 134 
procedure 15, 93 
prototype 14, 92 
RETURN 14, 94, 153 
trace procedures 160 
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programming details 191 
protected keywords 128 
PROTOTYPE 120 
prototypes 

arrays 113, 120, 122 
external functions 86 
programmer-defined data types 123 
programmer-defined functions 14, 92 
syntax of 200 
PUNCH 5, 165 
punched output 165 

quickscan mode 62 
quotation marks 1, 3, 133 

real numbers 3, 79 

recursive function calls 16, 98 

recursive pattern definitions 53, 54 

redefinition of functions 97 

REM 41 

£REM 129 

REMDR 82, 99 

REPLACE 12, 81 

replacement 177 

replacement statements 7, 24, 35, 146 

RETURN 14, 94, 134 

RETURN tracing 153 

returning a variable 134 

REWIND 170 

RPOS 44 

RTAB 41 

gRTNTYPE 129, 161 


scanner 26, 47, 62 
scanning 26 
SIZE 12, 81 
SNOBOL4 syntax 197 
source listing 173 
source program 172 
SPAN 38 
statements 
assignment 1, 146 


end 1, 20, 172, 176 
degenerate 177 
pattern matching 6, 24, 146 
replacement 7, 24, 35, 146 
statement components 146 
statement continuation 20 
statement evaluation 177 
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